[Character encoding] When regular-expression matching fails because of a character-encoding mismatch

I have run into this regular-expression exception too. In my case, the problem was character encoding, so I wrote some code that copes with several character encodings by trying each one in turn. Maybe this code will help you.

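+ (NSString *)encodedStringWithContentsOfURL:(NSURL *)url
{
    // Fetch the raw bytes of the web page (synchronous; nil on failure)
    NSData *data = [NSData dataWithContentsOfURL:url];
    if (data == nil) {
        return nil;
    }

    // Candidate encodings, tried in this order until one decodes the data
    NSStringEncoding enc_arr[] = {
        NSUTF8StringEncoding,         // UTF-8
        NSShiftJISStringEncoding,     // Shift_JIS
        NSJapaneseEUCStringEncoding,  // EUC-JP
        NSISO2022JPStringEncoding,    // JIS (ISO-2022-JP)
        NSUnicodeStringEncoding,      // Unicode (UTF-16)
        NSASCIIStringEncoding         // ASCII
    };
    NSString *data_str = nil;
    NSUInteger max = sizeof(enc_arr) / sizeof(enc_arr[0]);
    for (NSUInteger i = 0; i < max; i++) {
        // initWithData:encoding: returns nil when the bytes are not valid in that encoding
        data_str = [[NSString alloc] initWithData:data encoding:enc_arr[i]];
        if (data_str != nil) {
            break;
        }
    }
    return data_str;  // nil if none of the candidate encodings matched
}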
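
To show how the decoded string feeds into a regular expression, here is a minimal sketch. It is not part of the library: `DecodeWithFallback`, the sample HTML, and the `<title>` pattern are all made up for illustration, and the helper takes `NSData` directly so it can be tried without a network request. The sample bytes are Shift_JIS, so the UTF-8 attempt should return nil and the Shift_JIS attempt should succeed. As far as I know, passing a nil string to the `NSRegularExpression` matching methods raises an exception, which is why the result is checked before matching.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the same try-each-encoding idea, applied to NSData.
static NSString *DecodeWithFallback(NSData *data) {
    if (data == nil) {
        return nil;
    }
    const NSStringEncoding encodings[] = {
        NSUTF8StringEncoding,
        NSShiftJISStringEncoding,
        NSJapaneseEUCStringEncoding,
    };
    const size_t count = sizeof(encodings) / sizeof(encodings[0]);
    for (size_t i = 0; i < count; i++) {
        NSString *s = [[NSString alloc] initWithData:data encoding:encodings[i]];
        if (s != nil) {
            return s;  // first encoding that decodes the bytes wins
        }
    }
    return nil;
}

int main(void) {
    @autoreleasepool {
        // Made-up sample page, encoded as Shift_JIS (these bytes are not valid UTF-8)
        NSData *data = [@"<title>日本語</title>"
                        dataUsingEncoding:NSShiftJISStringEncoding];
        NSString *html = DecodeWithFallback(data);

        NSError *error = nil;
        NSRegularExpression *regex =
            [NSRegularExpression regularExpressionWithPattern:@"<title>(.*?)</title>"
                                                      options:0
                                                        error:&error];

        // Only match when decoding and pattern compilation both succeeded
        if (html != nil && regex != nil) {
            NSTextCheckingResult *match =
                [regex firstMatchInString:html
                                  options:0
                                    range:NSMakeRange(0, html.length)];
            if (match != nil) {
                NSLog(@"%@", [html substringWithRange:[match rangeAtIndex:1]]);
            }
        }
    }
    return 0;
}
```

Note that the order of the candidates matters: some byte sequences are valid in more than one encoding, so the first successful decode is a best guess and not a guarantee that the text is correct.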

You can download the whole category library from GitHub and run it as is. I hope this helps.

https://github.com/weed/p120801_CharacterEncodingLibrary
