@mmiliaus
Created August 23, 2012 11:33

Goal

Prerequisite

Installation

Ubuntu

Mac OS X

Windows

Debugging

var_dump($var) and exit are your best friends.

Test Cases

Before developing the full name parsing algorithm, let's first define all the cases we expect to receive as input.

<?php

$test_cases = array();                //=> []
$test_cases[] = 'Mr. John Doe';       //=> ['Mr. John Doe']
$test_cases[] = 'John Van Doe';       //=> ['Mr. John Doe', 'John Van Doe' ]
$test_cases[] = 'Mrs. Alice';         //=> ['Mr. John Doe', 'John Van Doe', 'Mrs. Alice']
$test_cases[] = 'Dr. John H. Watson'; //=> ['Mr. John Doe', 'John Van Doe', 'Mrs. Alice', 'Dr. John H. Watson']

?>

PHP basics

  • PHP code is surrounded by the <?php and ?> open/close delimiters (the closing ?> may be omitted at the end of a file that contains only PHP)
  • Every statement ends with ;
  • Every variable name starts with $
  • Strings are surrounded by ' or "
  • // starts a single-line comment

Arrays

  • array() initializes the $test_cases variable as an empty array
  • $test_cases[] = ... adds an element to the end of the array
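As a small illustration of the two points above, the sketch below builds an array and then reads it back. The variable $names and its values are made up for this example.

```php
<?php

$names = array();    // start with an empty array
$names[] = 'Alice';  // append: $names is now ['Alice']
$names[] = 'Bob';    // append: $names is now ['Alice', 'Bob']

echo count($names), "\n"; // number of elements in the array
echo $names[0], "\n";     // elements are indexed starting from 0
```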

Full Name Parse Pseudo-Code

Below is pseudo-code for a function that takes a full name as input and returns an associative array containing the title, first name, and last name.

title = try to extract "Mrs.", "Ms.", "Mr." or "Dr." from the full name
rest = full name with the title removed
tokens = split rest by whitespace
first_name = tokens[0]
last_name = join all tokens except the first one with spaces
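One possible PHP sketch of this pseudo-code is shown here. It rests on a few assumptions that are not in the original: the title pattern is anchored to the start of the string and also accepts "Dr." (to cover the test cases), the title is cut off before splitting so it does not end up as the first name, and the array keys title, first_name and last_name are chosen only for illustration.

```php
<?php

// Sketch only: key names and the anchored pattern are assumptions.
function parse_fullname($str)
{
    $title = '';

    // Look for a title at the start of the string and cut it off if found.
    if (preg_match('/^\s*(Mrs|Ms|Mr|Dr)\.\s*/i', $str, $matches)) {
        $title = $matches[1] . '.';
        $str = substr($str, strlen($matches[0]));
    }

    // Split the rest on runs of whitespace, ignoring empty pieces.
    $tokens = preg_split('/\s+/', trim($str), -1, PREG_SPLIT_NO_EMPTY);

    $first_name = array_shift($tokens);  // first token, or null if there is none
    $last_name  = implode(' ', $tokens); // all remaining tokens joined by spaces

    return array(
        'title'      => $title,
        'first_name' => $first_name === null ? '' : $first_name,
        'last_name'  => $last_name,
    );
}

print_r(parse_fullname('Dr. John H. Watson'));
```

With this sketch, 'Mrs. Alice' yields an empty last name and 'John Van Doe' yields an empty title, because everything after the first token is treated as the last name.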

parse_fullname($str)

Matching Title, RegEx

A person's title can be extracted with a regular expression (regex): a pattern that describes the text to look for. Because patterns can combine alternatives, groups, and flags, they are flexible enough to extract almost anything from a piece of text.

We will use the following regex: "/(Mrs|Ms|Mr|Dr)\./i". A short explanation:

  • /.../ are delimiters that tell PHP everything in between is the pattern
  • (...) groups sub-patterns
  • | translates to OR
  • i after the closing / is a flag that makes the pattern case-insensitive, so it matches "mr.", "Mr.", "MR.", etc.
  • \. matches a literal dot. An unescaped dot is a special regex character that matches any character, so it must be escaped with \ to match an actual dot.
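To see the pattern in action, the following sketch runs it with PHP's preg_match() against two of the test cases. The variable names ($pattern, $matches, $found, $unused) are chosen only for this example.

```php
<?php

$pattern = '/(Mrs|Ms|Mr|Dr)\./i';

// preg_match() returns 1 when the pattern matches and 0 when it does not.
// On a match, $matches[0] holds the whole match and $matches[1] the first group.
if (preg_match($pattern, 'Dr. John H. Watson', $matches)) {
    echo $matches[0], "\n"; // the matched title including the dot: Dr.
    echo $matches[1], "\n"; // the group without the dot: Dr
}

// No title in this name, so $found is 0.
$found = preg_match($pattern, 'John Van Doe', $unused);
```

Note that the pattern is not anchored, so it would also match a title that appears later in the string.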

Practice debugging

String into an array of words/tokens

Array into a string

Associative array/hash

Encapsulate into a function

Looping

foreach

Encapsulation

Skeleton

Properties

Methods

Adding UI
