Created
January 17, 2010 19:38
Gist: callison-burch/279534
| urdu_lines = open('/Users/ccb/Desktop/urdu_file').read().decode('utf8').split('\n') | |
| urdu_lines | |
| [u'\u0632\u0645\u0631\u06c1:\u0645\u0646\u062a\u062e\u0628_\u0645\u0642\u0627\u0644\u06d2', u'\u0632\u0645\u0631\u06c1:\u0627\u0645\u06cc\u062f\u0648\u0627\u0631_\u0628\u0631\u0627\u06d3_\u0645\u0646\u062a\u062e\u0628_\u0645\u0642\u0627\u0644\u06c1'] |
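The snippet above relies on Python 2's `str.decode`. A minimal sketch of the same step under Python 3, assuming the file is UTF-8 encoded with one category title per line. The helper name `read_category_titles` is hypothetical and not part of the original gist:

```python
import io


def read_category_titles(path):
    """Return the non-empty lines of a UTF-8 text file as a list of str."""
    # io.open decodes while reading, so no separate .decode('utf8') is needed.
    with io.open(path, encoding='utf-8') as handle:
        # splitlines() avoids the trailing empty string that split('\n')
        # produces when the file ends with a newline.
        return [line for line in handle.read().splitlines() if line.strip()]
```

Dropping blank lines is a design choice of this sketch; the original one-liner keeps them.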
| زمرہ:منتخب_مقالے | |
| زمرہ:امیدوار_براۓ_منتخب_مقالہ |
| urdu_featured_articles = query_category_members(urdu_lines[0], language='ur') | |
| urdu_featured_articles_candidates = query_category_members(urdu_lines[1], language='ur') | |
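| def print_english_names(article_list, language='ur'): | |
|     """Print each article title and its English Wikipedia title, tab-separated.""" | |
|     for article in article_list: | |
|         # lang_links is used as a dict mapping language code to linked title | |
|         lang_links = query_language_links(article, language) | |
|         if 'en' in lang_links: | |
|             print article, '\t', lang_links['en'] | |
|         else: | |
|             print article, '\t', "[no English equivalent]" |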
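`query_category_members` and `query_language_links` are not defined in this gist. As a rough sketch of what such helpers might do, the Python 3 code below builds MediaWiki API query URLs and extracts the relevant fields from an already-decoded JSON response. All function names here, and the split between URL building and parsing, are assumptions for illustration; the author's actual helpers may work differently. The response layout assumed is the API's legacy JSON format, in which each language link stores its title under the `*` key.

```python
from urllib.parse import urlencode


def api_url(language, **params):
    """Build a MediaWiki API query URL for one Wikipedia language edition."""
    query = dict(action='query', format='json', **params)
    # Sorting the parameters keeps the generated URL deterministic.
    return 'https://%s.wikipedia.org/w/api.php?%s' % (
        language, urlencode(sorted(query.items())))


def language_links_url(title, language='ur'):
    """URL that requests the interlanguage links of a single page."""
    return api_url(language, prop='langlinks', titles=title, lllimit='500')


def category_members_url(category, language='ur'):
    """URL that requests the pages belonging to a category."""
    return api_url(language, list='categorymembers', cmtitle=category,
                   cmlimit='500')


def parse_language_links(response):
    """Map language code -> linked title for the page(s) in the response."""
    links = {}
    for page in response.get('query', {}).get('pages', {}).values():
        # Pages without interlanguage links have no 'langlinks' key.
        for link in page.get('langlinks', []):
            links[link['lang']] = link['*']
    return links


def parse_category_members(response):
    """Return the titles listed under query.categorymembers."""
    members = response.get('query', {}).get('categorymembers', [])
    return [member['title'] for member in members]
```

This sketch ignores result continuation, so a category with more than 500 members would be truncated; fetching and JSON decoding are left to the caller.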
| >>> print_english_names(urdu_featured_articles) | |
| افغانستان Afghanistan | |
| اقوام متحدہ United Nations | |
| البرٹ آئنسٹائن Albert Einstein | |
| النور An-Nur | |
| امیر تیمور Timur | |
| بحر اوقیانوس Atlantic Ocean | |
| تونس Tunisia | |
| جمال عبدالناصر Gamal Abdel Nasser | |
| حبالہ محیط عالم World Wide Web | |
| سانچہ:منتخب مقالہ Template:Featured article | |
| سلطنت غزنویہ Ghaznavids | |
| سلطنت غوریہ Ghurids | |
| سلیمان اعظم Suleiman the Magnificent | |
| شام Syria | |
| صفوی سلطنت Safavid dynasty | |
| صلاح الدین ایوبی Saladin | |
| صلیبی جنگیں Crusades | |
| طنابی شمارندکاری Grid computing | |
| عراق Iraq | |
| عمر بن عبدالعزیز Umar ibn AbdulAziz | |
| غزوہ بدر Battle of Badr | |
| فیصل آباد Faisalabad | |
| لبنان Lebanon | |
| محمد اسد Muhammad Asad | |
| محمد صلی اللہ علیہ و آلہ و سلم Muhammad | |
| محمد علی (مکے باز) Muhammad Ali | |
| محمد علی پاشا Muhammad Ali of Egypt | |
| مسجد Mosque | |
| مملوک Mamluk | |
| میلکم ایکس Malcolm X | |
| میٹرکس Matrix (mathematics) | |
| نادر شاہ Nader Shah | |
| نظام شمسی Solar System | |
| ولید بن عبدالملک Al-Walid I | |
| >>> print_english_names(urdu_featured_articles_candidates) | |
| 6 روزہ جنگ Six-Day War | |
| آگسٹو پنوشے Augusto Pinochet | |
| استنبول Istanbul | |
| الاحزاب Al-Ahzab | |
| انور سادات Anwar El Sadat | |
| اوپیک OPEC | |
| ایتھنز Athens | |
| ایران عراق جنگ Iran–Iraq War | |
| ایڈز (آسان) [no English equivalent] | |
| تاشقند Tashkent | |
| تحریک خلافت Khilafat Movement | |
| جابر بن حیان Geber | |
| جامع مسجد قرطبہ Great Mosque of Córdoba | |
| جسام Mammoth | |
| جنگ یوم کپور Yom Kippur War | |
| خلافت امویہ Umayyad Caliphate | |
| خلافت عباسیہ Abbasid Caliphate | |
| خیر الدین باربروسا Hayreddin Barbarossa | |
| روبالہ Robot | |
| روم Rome | |
| سانحۂ کربلا Battle of Karbala | |
| سریبرینیتسا کا قتل عام Srebrenica massacre | |
| سعودی عرب Saudi Arabia | |
| سلطنت عثمانیہ Ottoman Empire | |
| عبد الستار ایدھی Abdul Sattar Edhi | |
| فیصل بن عبدالعزیز السعود Faisal of Saudi Arabia | |
| قاہرہ Cairo | |
| كُلیَہ Kidney | |
| ماسکو Moscow | |
| مالی Mali | |
| محمد عاکف ارصوی Mehmet Akif Ersoy | |
| مملکت آصفیہ Hyderabad State | |
| نفخ Emphysema | |
| نیویارک شہر New York City | |
| پاکستان Pakistan | |
| چین People's Republic of China | |
| ڈھانچہ انسانی Human skeleton | |
| کراچی Karachi | |
| کرکٹ Cricket | |
| کریمیائی تاتاریوں کی جبری ملک بدری Deportation of Crimean Tatars | |
| گوادر Gwadar | |
| یورپی اتحاد European Union |
| from operator import itemgetter  # needed for the sort key below | |
| page_views = {} | |
| for article in urdu_featured_articles: | |
|     stats = query_page_view_stats(article, language='ur') | |
|     page_views[article] = stats['total_views'] | |
| for article in urdu_featured_articles_candidates: | |
|     stats = query_page_view_stats(article, language='ur') | |
|     page_views[article] = stats['total_views'] | |
| # Print the 15 most-viewed articles, highest count first. | |
| for page_view in sorted(page_views.iteritems(), key=itemgetter(1), reverse=True)[0:15]: | |
|     count = page_view[1] | |
|     print count, "\t", | |
|     print_english_names([page_view[0]]) |
| 1112 پاکستان Pakistan | |
| 507 محمد صلی اللہ علیہ و آلہ و سلم Muhammad | |
| 342 کراچی Karachi | |
| 225 افغانستان Afghanistan | |
| 217 عراق Iraq | |
| 197 اقوام متحدہ United Nations | |
| 195 مسجد Mosque | |
| 191 سانحۂ کربلا Battle of Karbala | |
| 187 چین People's Republic of China | |
| 186 شام Syria | |
| 177 سلطنت عثمانیہ Ottoman Empire | |
| 171 سعودی عرب Saudi Arabia | |
| 160 صلاح الدین ایوبی Saladin | |
| 149 حبالہ محیط عالم World Wide Web | |
| 137 کرکٹ Cricket |
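The ranking that produced the list above uses Python 2's `dict.iteritems`. A small Python 3 sketch of the same top-N selection follows, using three of the counts reported above as sample data. The helper name `top_viewed` is hypothetical:

```python
from operator import itemgetter


def top_viewed(page_views, n=15):
    """Return the n (article, count) pairs with the highest counts, largest first."""
    # dict.items() replaces Python 2's iteritems(); itemgetter(1) sorts by count.
    return sorted(page_views.items(), key=itemgetter(1), reverse=True)[:n]


# Sample counts copied from the output above.
sample_views = {'Karachi': 342, 'Pakistan': 1112, 'Muhammad': 507}
for article, count in top_viewed(sample_views, n=2):
    print(count, article, sep='\t')
```

For very large dictionaries, `heapq.nlargest(n, page_views.items(), key=itemgetter(1))` would avoid sorting every entry, though at this scale a full sort is simpler.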