rakeshsukla53 · June 20, 2020 21:54
diff --git a/Regex b/Regex
 Regular Expression  #use this link http://regexone.com/lesson/3 

 1- 

 match text 	abc123xyz 		?
 match text 	define “123” 		?
 match text 	var g = 123;

 .* -> . matches any single character (except a newline) and * repeats it zero or more times, so .* matches every line
 [a-z0-9]+ -> matches one or more lowercase letters or digits, so it covers only abc123xyz in full
 [a-z \d]+ -> matches one or more lowercase letters, spaces, or digits (\d means any digit)
 ^a.+ -> matches any string that starts with a followed by at least one more character, such as abc123xyz
 [a-z\W0-9]+ -> matches all three lines: a-z covers the lowercase letters, \W covers any non-alphanumeric character (other than the underscore), and 0-9 covers the digits
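
A minimal Python sketch of exercise 1, assuming the standard `re` module and straight quotes in the second line; the variable names are illustrative and not from the lesson. `re.fullmatch` is used so that a pattern counts only when it covers the whole line.

```python
import re

lines = ["abc123xyz", 'define "123"', "var g = 123;"]

# Lowercase letters and digits only: just the first line is covered in full.
only_first = [s for s in lines if re.fullmatch(r"[a-z0-9]+", s)]

# Adding \W (any non-word character) admits spaces, quotes, = and ;
all_three = [s for s in lines if re.fullmatch(r"[a-z\W0-9]+", s)]

# * means zero or more, so .* matches even the empty string.
matches_empty = re.fullmatch(r".*", "") is not None

print(only_first)
print(all_three)
print(matches_empty)
```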

 2- your task 	text 		result
 match text 	cat. 		?
 match text 	896. 		?
 match text 	?=+. 		?
 skip text 	abc1 		?

 .+\. -> .+ matches one or more of any character, and \. matches a literal period. An unescaped dot is a wildcard, so to match an actual period you must escape it as "\."
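
To see why the escape matters, here is a small Python sketch (standard `re` module assumed; variable names are illustrative) that compares the escaped pattern with a version whose final dot is left unescaped.

```python
import re

samples = ["cat.", "896.", "?=+.", "abc1"]

# \. is a literal period, so the text must end in a real dot.
escaped = [s for s in samples if re.fullmatch(r".+\.", s)]

# An unescaped final dot is a wildcard, so "abc1" slips through.
unescaped = [s for s in samples if re.fullmatch(r".+.", s)]

print(escaped)
print(unescaped)
```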

 3- your task 	text 		result
 match text 	can 		?
 match text 	man 		?
 match text 	fan 		?
 skip text 	dan 		?
 skip text 	ran 		?
 skip text 	pan

 [cmf] -> matches a single character that is c, m, or f. On its own it already separates these words, but [cmf]an states the intent precisely: can, man, and fan match; dan, ran, and pan do not
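
A short Python sketch of exercise 3, assuming the character class is followed by the literal "an" so that the whole word is checked; the names `matched` and `skipped` are illustrative.

```python
import re

words = ["can", "man", "fan", "dan", "ran", "pan"]

# [cmf] accepts exactly one of c, m, f; "an" must follow literally.
matched = [w for w in words if re.fullmatch(r"[cmf]an", w)]
skipped = [w for w in words if w not in matched]

print(matched)
print(skipped)
```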

 4- Rakesh Ranjan Sukla 10/08/2015 

 I just want to extract the date field from it
 [0-9/]+ -> matches a run of digits and '/' characters, which here is only the date field
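
The extraction can be sketched in Python as follows. The stricter pattern `\d{2}/\d{2}/\d{4}` is an assumption added for comparison (it presumes a two-digit/two-digit/four-digit layout) and is not part of the original note.

```python
import re

text = "Rakesh Ranjan Sukla 10/08/2015"

# Loose: any run of digits and slashes.
loose = re.search(r"[0-9/]+", text).group()

# Stricter (assumed layout): two digits, slash, two digits, slash, four digits.
strict = re.search(r"\d{2}/\d{2}/\d{4}", text).group()

print(loose)
print(strict)
```

The loose form would also pick up a stray number or slash that appears earlier in the text, which is why a fixed layout can be safer when the date format is known.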

 5- match text 	hog 		✓
   match text 	dog 		✓
   skip text 	bog 		✓

 -> [hd] matches only h or d, so [hd]og matches hog and dog but not bog
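
As a rough Python illustration, the same result can be reached by listing the allowed letters or by excluding the unwanted one. The `[^b]og` variant and the extra word "log" are assumptions added here to show where the two approaches differ.

```python
import re

words = ["hog", "dog", "bog"]

# Allow-list: first letter must be h or d.
allow = [w for w in words if re.fullmatch(r"[hd]og", w)]

# Deny-list: first letter may be anything except b.
deny = [w for w in words if re.fullmatch(r"[^b]og", w)]

# The two differ on words outside the exercise, e.g. the made-up "log".
allow_log = re.fullmatch(r"[hd]og", "log") is not None
deny_log = re.fullmatch(r"[^b]og", "log") is not None

print(allow, deny, allow_log, deny_log)
```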

 6- match text 	Ana 		✓
 match text 	Bob 		✓
 match text 	Cpc 		✓
 skip text 	aax 		✓
 skip text 	bby 		✓
 skip text 	ccz 		✓

 -> [^a-z] matches any single character that is not a lowercase letter a-z
 -> [ABC] (or the range [A-C]) matches a single A, B, or C; anchor it as ^[A-C] to require the word to start with one of them
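
Both approaches can be checked with a small Python sketch. It assumes `re.match`, which only tries the pattern at the start of the string, as a stand-in for an explicit `^` anchor; variable names are illustrative.

```python
import re

words = ["Ana", "Bob", "Cpc", "aax", "bby", "ccz"]

# re.match anchors at the start, so this means "starts with A, B, or C".
starts_upper = [w for w in words if re.match(r"[A-C]", w)]

# re.search looks anywhere: words containing any non-lowercase character.
has_non_lower = [w for w in words if re.search(r"[^a-z]", w)]

print(starts_upper)
print(has_non_lower)
```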

 7- Matching a specific number of repetitions

 your task 	text 		result
 match text 	wazzzzup 	 ✓
 match text 	wazzzup 	 ✓
 skip text 	wazup 		 ✓

 -> z{2,4} matches z repeated 2 to 4 times, so it matches wazzzzup and wazzzup but not wazup. {m,n} sets the minimum and maximum number of repetitions
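
A hedged Python sketch of the repetition quantifier. Wrapping the quantifier in the surrounding letters (`waz{2,4}up`) and the five-z test word are assumptions added here to show that a bare `z{2,4}` does not enforce an upper limit on its own.

```python
import re

words = ["wazzzzup", "wazzzup", "wazup"]

# Between 2 and 4 z's, with the surrounding letters spelled out.
matched = [w for w in words if re.search(r"waz{2,4}up", w)]

# With 5 z's, a bare z{2,4} still finds 4 of them inside the word...
bare_hit = re.search(r"z{2,4}", "wazzzzzup") is not None
# ...while the full-word pattern rejects it.
full_hit = re.search(r"waz{2,4}up", "wazzzzzup") is not None

print(matched, bare_hit, full_hit)
```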

 8- your task 	text 		result
 match text 	aaaabcc 		?
 match text 	aabbbbc 		?
 match text 	aacc                    ? 

 -> .+ matches all three lines, but avoid it: it also matches any other non-empty text
 -> [abc]+ also works and accepts only the characters a, b, and c
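
The difference between the two patterns shows up on unrelated input. In this Python sketch the string "hello world" is a made-up example, not part of the exercise.

```python
import re

lines = ["aaaabcc", "aabbbbc", "aacc"]
other = "hello world"  # hypothetical text that should not match

# Both patterns accept every exercise line.
dot_ok = all(re.fullmatch(r".+", s) for s in lines)
class_ok = all(re.fullmatch(r"[abc]+", s) for s in lines)

# Only the character class rejects the unrelated text.
dot_other = re.fullmatch(r".+", other) is not None
class_other = re.fullmatch(r"[abc]+", other) is not None

print(dot_ok, class_ok, dot_other, class_other)
```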

 9- your task 	text 		result
 match text 	1 file found? 		?
 match text 	2 files found? 		?
 match text 	x files found? 		?

 -> ? makes the preceding character optional: the pattern ab?c matches either "abc" or "ac"
 -> \w files? found\? matches all three lines: files? matches both "file" and "files", and \? matches the literal question mark
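
A small Python sketch of the optional quantifier. It assumes the exercise pattern ends in `\?` so that the trailing question mark in the text is matched literally; variable names are illustrative.

```python
import re

# b? means "zero or one b".
optional_b = [s for s in ["abc", "ac", "abbc"] if re.fullmatch(r"ab?c", s)]

lines = ["1 file found?", "2 files found?", "x files found?"]

# s? makes the plural optional; \? is a literal question mark.
found = [s for s in lines if re.fullmatch(r"\w files? found\?", s)]

print(optional_b)
print(found)
```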

 10- your task 	text 		result
 match text 	1. abc 		✓
 match text 	2. abc 		✓
 match text 	3. abc 		✓
 skip text 	4.abc

 -> \d\.\s+abc matches every line except 4.abc, which has no whitespace between the period and abc
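
The whitespace requirement can be checked with a short Python sketch (standard `re` module assumed; names are illustrative).

```python
import re

lines = ["1. abc", "2. abc", "3. abc", "4.abc"]

# \d = one digit, \. = literal period, \s+ = at least one whitespace character.
matched = [s for s in lines if re.fullmatch(r"\d\.\s+abc", s)]

print(matched)
```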

 11- your task 	text 		result
 match text 	Mission: successful 		✓
 skip text 	Last Mission: unsuccessful 		✓
 skip text 	Next Mission: successful upon capture of target 	✓ 

 -> ^Mission: successful$ matches only a string that starts with "Mission:" and ends with "successful"
 ^ - matches the start of the string
 $ - matches the end of the string
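
To illustrate what the anchors add, this Python sketch runs the pattern with and without `^` and `$`; the unanchored variant is an assumption added for comparison.

```python
import re

lines = [
    "Mission: successful",
    "Last Mission: unsuccessful",
    "Next Mission: successful upon capture of target",
]

# Anchored: the whole string must be exactly this text.
anchored = [s for s in lines if re.search(r"^Mission: successful$", s)]

# Unanchored: the text may appear anywhere inside the string.
unanchored = [s for s in lines if re.search(r"Mission: successful", s)]

print(anchored)
print(unanchored)
```

Without the anchors the third line also matches, because it contains the phrase in the middle of the string.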

 12- Match groups 

 Regular expressions can extract part of a match by grouping characters with the special ( and ) (parenthesis) metacharacters. Any pattern placed inside the parentheses is captured. For example, the pattern ^(IMG\d+\.png)$ captures a full filename such as IMG followed by digits and .png, from start to finish. To capture the filename without the extension, use ^(IMG\d+)\.png$ instead.

 your task 	text 	capture 	result
 capture text 	file_a_record_file.pdf 	file_a_record_file 	✓
 capture text 	file_yesterday.pdf 	file_yesterday 	✓
 skip text 	testfile_fake.pdf.tmp 		✓

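 (\w+)\.pdf$ -> captures one or more word characters (letters, digits, or underscores) that are followed by .pdf at the end of the string; the dot is escaped so that it matches only a literal period
 () -> this is known as a capture group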
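
A minimal Python sketch of the capture group, assuming the dot before pdf is escaped as `\.` so that it matches only a literal period; `group(1)` returns the text captured by the first pair of parentheses.

```python
import re

names = ["file_a_record_file.pdf", "file_yesterday.pdf", "testfile_fake.pdf.tmp"]
pattern = re.compile(r"(\w+)\.pdf$")

captured = []
for name in names:
    m = pattern.search(name)
    if m:
        # group(1) is the part inside the parentheses, without ".pdf".
        captured.append(m.group(1))

print(captured)
```

The third name is skipped because `$` requires .pdf to be the very end of the string, and that name ends in .tmp.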

 13- value.replace(/[^a-zA-Z0-9\n\.]/, " ") -> in OpenRefine, replaces every character that is not a letter, a digit, a newline, or a period with a space
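
Outside OpenRefine, the same character class can be tried with Python's `re.sub`. This is only an analogue for experimenting with the pattern, not OpenRefine's own API, and the sample string is made up.

```python
import re

value = "Hello, World! v1.0"  # hypothetical cell value

# Anything that is not a letter, digit, newline, or period becomes a space.
cleaned = re.sub(r"[^a-zA-Z0-9\n.]", " ", value)

print(repr(cleaned))
```

Note that existing spaces are also outside the class, so they are replaced by spaces and the comma and exclamation mark each leave a double space behind.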