Monday, August 10, 2009
String#extract - Simpler regex matching
String#extract is an extension to the core String class that simplifies extracting values from a string using regular expressions.
The code
First, let me show you the method:
class String
  # Returns the various captured elements from this string with the regex applied.
  # Usage: a, b = 'abc123 is cool'.extract(/([a-z]*)([0-9]*)/)
  # Result: a = 'abc', b = '123'
  # With no capture in regex, returns full match
  # If no match, returns nil
  def extract(regex)
    data = self.match(regex)
    return nil unless data
    if data.size > 1
      # One or more capture groups: return just the captured values
      return data.captures
    else
      # No capture groups: return the full match
      return data[0]
    end
  end
end
So, what does it do?
I always find using String#match and its Regexp equivalents tedious and hard to follow.
Here's an example of trying to pull the various phone fields out of a user-entered phone number:
# A string, and a regex to parse it out
number = "(800) 555-1212"
regex = /([0-9]{3})?[^0-9]*([0-9]{3})[^0-9]*([0-9]{4})/

# This next bit is a bit verbose for my taste
match = number.match(regex)
if match
  area = match[1]
  prefix = match[2]
  suffix = match[3]
end
Instead, check out using the extract method:
# Same setup
number = "(800) 555-1212"
regex = /([0-9]{3})?[^0-9]*([0-9]{3})[^0-9]*([0-9]{4})/

# Short, sweet, and easy to read
area, prefix, suffix = number.extract(regex)
Basically, #extract allows you to use a regex to pull a value out of a string instance, cleanly. Here are a few more examples:
"Hey, Rob! Cool method!".extract(/R[a-z]*/)
# "Rob"

"$2,700.00".extract(/([0-9,]+)\.([0-9]{2})/)
# ["2,700", "00"]

first, last = "Rob Morris".extract(/([a-z]+)\s+([a-z]+)/i)
# first = "Rob", last = "Morris"
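One case those examples don't cover is a failed match. Here's a minimal sketch of what happens then, assuming the extract method from this post (restated compactly with MatchData#captures so the snippet runs on its own). The sample strings and the label variable are made up for illustration.

```ruby
# Compact restatement of the post's String#extract, so this snippet is self-contained.
class String
  def extract(regex)
    data = match(regex)
    return nil unless data
    data.size > 1 ? data.captures : data[0]
  end
end

# No match: extract returns nil instead of raising.
result = "no digits here".extract(/[0-9]+/)

# Multiple assignment from nil simply leaves every target nil.
area, prefix = "not a phone number".extract(/([0-9]{3})-([0-9]{4})/)

# Guard before use so later code doesn't call methods on nil.
label = area ? "#{area}-#{prefix}" : "(no match)"
```

So the one-line destructuring form stays safe on bad input, as long as you check the values before using them.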
So there you have it. Extract a single value with a plain regular expression, or one or more values using regexen with capture groups in them. And the syntax is clean and elegant. Hope you like it!

