@mayurbhangale
Created June 21, 2019 12:44
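import xml.etree.ElementTree as ET

import pandas as pd

# Parse the training file and get the root element.
tree = ET.parse('Restaurants_Train.xml')
root = tree.getroot()

labeled_reviews = []
for sentence in root.findall("sentence"):
    # Reset per sentence so values never leak in from the previous iteration.
    terms, aspects = [], []
    # Compare with None: an Element's truth value depends on its child count.
    aspect_terms = sentence.find("aspectTerms")
    if aspect_terms is not None:
        terms = [term.get("term") for term in aspect_terms.findall("aspectTerm")]
    aspect_categories = sentence.find("aspectCategories")
    if aspect_categories is not None:
        aspects = [aspect.get("category") for aspect in aspect_categories.findall("aspectCategory")]
    # The first child of each sentence is taken to hold the review text.
    entry = {"text": sentence[0].text, "terms": terms, "aspects": aspects}
    labeled_reviews.append(entry)

# One row per sentence, with columns text, terms, and aspects.
labeled_df = pd.DataFrame(labeled_reviews)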
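
To try the extraction without the data file, here is a minimal sketch that runs the same logic on an inline XML string. The sample sentences and the `extract_reviews` helper are made up for illustration; the element and attribute names are only those the loop above reads, so the real file may contain more than is shown here.

```python
import xml.etree.ElementTree as ET

# Made-up sample; element and attribute names mirror those the loop above reads.
SAMPLE_XML = """
<sentences>
  <sentence id="1">
    <text>The pizza was great but the service was slow.</text>
    <aspectTerms>
      <aspectTerm term="pizza"/>
      <aspectTerm term="service"/>
    </aspectTerms>
    <aspectCategories>
      <aspectCategory category="food"/>
      <aspectCategory category="service"/>
    </aspectCategories>
  </sentence>
  <sentence id="2">
    <text>We will come back.</text>
  </sentence>
</sentences>
"""


def extract_reviews(root):
    """Return one dict per <sentence> with its text, terms, and aspects (hypothetical helper)."""
    rows = []
    for sentence in root.findall("sentence"):
        rows.append({
            "text": sentence.findtext("text"),
            # A path such as "aspectTerms/aspectTerm" yields [] when the container is absent.
            "terms": [t.get("term") for t in sentence.findall("aspectTerms/aspectTerm")],
            "aspects": [a.get("category") for a in sentence.findall("aspectCategories/aspectCategory")],
        })
    return rows


rows = extract_reviews(ET.fromstring(SAMPLE_XML.strip()))
for row in rows:
    print(row)
```

The resulting list of dicts can be passed to `pd.DataFrame` exactly as `labeled_reviews` is. The second sample sentence has no aspect elements, so its `terms` and `aspects` come back as empty lists.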