Parse Full Names with Ruby
May 31st, 2009
We do a lot of data migrations for Donor Tools. Many folks come to us from other systems, and they need their data ported over. Depending on the kind of system they were using, this can be anything from a quick script to a major data transformation headache.
Recently I needed a quick way to parse a full name string into name parts. Given a name like “Dr. Joe Donor, M.D.”, I wanted to end up with a name object with a prefix of “Dr.”, a suffix of “M.D.”, a first name of “Joe”, and a last name of “Donor”. To complicate matters, it also needed to handle odd permutations like “Dr. and Mrs. Joe and Jane Donor”.
The main problem I had was with the “and”. Whenever there was an “and”, the words before and after it needed to be joined with it into a single part, so that “Dr. and Mrs.” is treated as one prefix and “Joe and Jane” as one first name. This turned out to be nearly impossible with regular expressions alone, but pretty easy with a combination of Ruby and regex. Here’s how it looks.
    class Name < ActiveRecord::Base
      def self.parse(name)
        return false unless name.is_a?(String)

        # First, split the name into an array
        parts = name.split

        # If any part is "and", then put together the two parts around it
        # For example, "Mr. and Mrs." or "Mickey and Minnie"
        parts.each_with_index do |part, i|
          if ["and", "&"].include?(part) && i > 0
            p3 = parts.delete_at(i+1)
            p2 = parts.at(i)
            p1 = parts.delete_at(i-1)
            parts[i-1] = [p1, p2, p3].join(" ")
          end
        end

        # Build a hash of the remaining parts.
        # The entries are evaluated in order, so the suffix and last name
        # come off the end first, then the prefix and first name off the front.
        {
          :suffix      => (parts.pop unless parts.last !~ /(\w+\.|[IVXLM]+|[A-Z]+)$/),
          :last_name   => parts.pop,
          :prefix      => (parts.shift unless parts[0] !~ /^\w+\./),
          :first_name  => parts.shift,
          :middle_name => parts.join(" ")
        }
      end
    end
Here’s the output:
    Name.parse "Mr. Joe Donor"
    => {:middle_name=>"", :prefix=>"Mr.", :last_name=>"Donor", :suffix=>nil, :first_name=>"Joe"}

    Name.parse "Dr. and Mrs. Joe Donor, M.D."
    => {:middle_name=>"", :prefix=>"Dr. and Mrs.", :last_name=>"Donor,", :suffix=>"M.D.", :first_name=>"Joe"}

    Name.parse "Joe and Jane Donor"
    => {:middle_name=>"", :prefix=>nil, :last_name=>"Donor", :suffix=>nil, :first_name=>"Joe and Jane"}

    Name.parse "Joe and Jane Major-Donor"
    => {:middle_name=>"", :prefix=>nil, :last_name=>"Major-Donor", :suffix=>nil, :first_name=>"Joe and Jane"}
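Two quirks show up in the code and output above: the last name in the second example keeps its trailing comma (“Donor,”), and the “and” step deletes from the array while it is still looping over it. The following is a rough sketch of one way the splitting step could be tightened up before the hash is built. The helper names (strip_trailing_commas, merge_conjunctions, tokenize_name) are made up for illustration and are not part of the original Name.parse. The sketch is plain Ruby with no ActiveRecord, and it leaves a leading or trailing “and” as its own word, which differs slightly from the original.

```ruby
# Hypothetical helpers for illustration only; not part of the original Name.parse.
CONJUNCTIONS = ["and", "&"].freeze

# Drop trailing commas so that "Donor," becomes "Donor".
def strip_trailing_commas(parts)
  parts.map { |part| part.sub(/,+\z/, "") }.reject(&:empty?)
end

# Join the words around "and" / "&" into a single part, building a new
# array instead of deleting from the one being iterated over.
def merge_conjunctions(parts)
  merged = []
  i = 0
  while i < parts.length
    part = parts[i]
    if CONJUNCTIONS.include?(part) && !merged.empty? && i + 1 < parts.length
      # Glue the previous output word, the conjunction, and the next word.
      merged[-1] = [merged.last, part, parts[i + 1]].join(" ")
      i += 2 # skip the word that was just consumed
    else
      merged << part
      i += 1
    end
  end
  merged
end

# Split a full name into cleaned-up parts, ready for prefix/suffix detection.
def tokenize_name(name)
  merge_conjunctions(strip_trailing_commas(name.split))
end

tokenize_name("Dr. and Mrs. Joe Donor, M.D.")
# => ["Dr. and Mrs.", "Joe", "Donor", "M.D."]
```

Under these assumptions, the resulting array could be fed into the same pop and shift logic used to build the hash, and the last name would come out as “Donor” rather than “Donor,”.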