@dhargitai
Last active August 29, 2015 14:02
Filter rows with invalid or duplicate emails from a CSV file
<?php
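function filterRowsWithInvalidEmailsInCsv($fileName, $emailFieldIndex, $hasHeaderRow)
{
    // $fileName is the path without the ".csv" extension.
    $handle = fopen("{$fileName}.csv", 'r');
    if ($handle === FALSE) {
        return FALSE;
    }
    $filteredCsvFile = fopen("{$fileName}_filtered.csv", 'w');
    if ($filteredCsvFile === FALSE) {
        fclose($handle);
        return FALSE;
    }
    $existingEmails = array(); // lowercased emails already written, stored as keys
    $count = 0;                // number of data rows kept
    // A length of 0 lets fgetcsv() read lines of any size.
    while (($data = fgetcsv($handle, 0, ",")) !== FALSE) {
        if ($hasHeaderRow) {
            // Copy the header row through unchanged.
            fputcsv($filteredCsvFile, $data);
            $hasHeaderRow = FALSE;
            continue;
        }
        // Blank or short rows have no email column; drop them.
        if (!isset($data[$emailFieldIndex])) {
            continue;
        }
        $email = strtolower($data[$emailFieldIndex]);
        if (filter_var($email, FILTER_VALIDATE_EMAIL) &&
            !isset($existingEmails[$email])) {
            $data[$emailFieldIndex] = $email;
            fputcsv($filteredCsvFile, $data);
            $existingEmails[$email] = TRUE;
            $count++;
        }
    }
    fclose($handle);
    fclose($filteredCsvFile);
    return $count;
}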
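// Guard against a missing first argument or one that does not end in ".csv".
if (!isset($argv[1]) || substr($argv[1], -4) !== '.csv') {
    echo "\nInvalid arguments. Usage:\nphp -f {$argv[0]} CsvFileName.csv [emailFieldIndex=... [hasHeaderRow=...]]\n\n";
    exit(1);
}
// Parse the remaining key=value arguments, e.g. emailFieldIndex=2 hasHeaderRow=1.
parse_str(implode('&', array_slice($argv, 2)), $options);
// Strip the trailing ".csv"; the function appends it again.
$fileName = substr($argv[1], 0, -4);
$emailFieldIndex = isset($options['emailFieldIndex']) ? (int) $options['emailFieldIndex'] : 0;
// Command-line values are strings, so "false" and "no" must be mapped to FALSE explicitly.
$hasHeaderRow = isset($options['hasHeaderRow'])
    ? filter_var($options['hasHeaderRow'], FILTER_VALIDATE_BOOLEAN)
    : FALSE;
filterRowsWithInvalidEmailsInCsv($fileName, $emailFieldIndex, $hasHeaderRow);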
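
The rule the script applies to each row (lowercase the email, keep it only if it validates and has not been seen before) can be tried in isolation on an in-memory list. The sketch below is only an illustration: `keepUniqueValidEmails` is a hypothetical helper that is not part of the gist, and it assumes an array keyed by email is an acceptable way to track what has already been seen.

```php
<?php
// Hypothetical helper for illustration only; not part of the original gist.
function keepUniqueValidEmails(array $emails)
{
    $seen = array(); // lowercased emails already accepted, stored as keys
    $kept = array(); // accepted emails in their original order
    foreach ($emails as $email) {
        $email = strtolower($email);
        // Skip anything that fails validation or was already accepted.
        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false || isset($seen[$email])) {
            continue;
        }
        $seen[$email] = true;
        $kept[] = $email;
    }
    return $kept;
}

$result = keepUniqueValidEmails(array(
    'Alice@Example.com',
    'alice@example.com', // duplicate once lowercased
    'not-an-email',      // fails FILTER_VALIDATE_EMAIL
    'bob@example.org',
));
print_r($result); // alice@example.com and bob@example.org remain
```

Because comparison happens after lowercasing, addresses that differ only in case count as duplicates, and the first occurrence is the one that is kept.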