{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Predicting_News_Category_with_BERT_in_Tensorflow.ipynb",
"provenance": [],
"collapsed_sections": [
"wEhTK6Sypwqr",
"HAssmxxJp0yM",
"kyzTzLpyqJUf",
"pmFYvkylMwXn",
"ccp5trMwRtmr",
"_vrumsg9uygH"
]
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "dCpvgG0vwXAZ",
"colab_type": "text"
},
"source": [
"#Predicting News Category With BERT IN Tensorflow\n",
"\n",
"---\n",
"\n",
"Bidirectional Encoder Representations from Transformers or BERT for short is a very popular NLP model from Google known for producing state-of-the-art results in a wide variety of NLP tasks.\n",
"\n",
"The importance of Natural Language Processing(NLP) is profound in the Artificial Intelligence domain. The most abundant data in the world today is in the form of texts and having a powerful text processing system is critical and is more than just a necessity.\n",
"\n",
"In this article we look at implementing a multi-class classification using the state-of-the-art model, BERT.\n",
"\n",
"---\n",
"\n",
"#####Pre-Requisites:\n",
"\n",
"#####An Understanding of BERT\n",
"---\n",
"\n",
"##About Dataset\n",
"\n",
"For this article, we will use MachineHack’s Predict The News Category Hackathon data. The data consists of a collection of news articles which are categorized into four sections. The features of the datasets are as follows:\n",
"\n",
"Size of training set: 7,628 records\n",
"Size of test set: 2,748 records\n",
"\n",
"FEATURES:\n",
"\n",
"STORY: A part of the main content of the article to be published as a piece of news.\n",
"SECTION: The genre/category the STORY falls in.\n",
"\n",
"There are four distinct sections where each story may fall in to. The Sections are labelled as follows :\n",
"Politics: 0\n",
"Technology: 1\n",
"Entertainment: 2\n",
"Business: 3\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wEhTK6Sypwqr",
"colab_type": "text"
},
"source": [
"##Mounting Google Drive\n",
"\n",
"---\n",
"Here I have uploaded the dataset in to my Google Drive folder. To access the datasets we must first mount the drive in google colab. Type in and enter the following code to authenticate and mount your Google drive on to colab.\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "5w8YSdcFPA_r",
"colab_type": "code",
"outputId": "a3a318a9-c3d3-4bdc-9095-0cc77d2578fa",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"from google.colab import drive\n",
"drive.mount(\"/GD\")"
],
"execution_count": 57,
"outputs": [
{
"output_type": "stream",
"text": [
"Drive already mounted at /GD; to attempt to forcibly remount, call drive.mount(\"/GD\", force_remount=True).\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HAssmxxJp0yM",
"colab_type": "text"
},
"source": [
"## Importing Necessary Libraries"
]
},
{
"cell_type": "code",
"metadata": {
"id": "hsZvic2YxnTz",
"colab_type": "code",
"outputId": "124ac0c4-df1e-40f4-a29f-68995302ac1e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
}
},
"source": [
"import pandas as pd\n",
"import tensorflow as tf\n",
"import tensorflow_hub as hub\n",
"from datetime import datetime\n",
"from sklearn.model_selection import train_test_split\n",
"import os\n",
"\n",
"print(\"tensorflow version : \", tf.__version__)\n",
"print(\"tensorflow_hub version : \", hub.__version__)"
],
"execution_count": 58,
"outputs": [
{
"output_type": "stream",
"text": [
"tensorflow version : 1.15.0\n",
"tensorflow_hub version : 0.7.0\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "jviywGyWyKsA",
"colab_type": "code",
"outputId": "3d8b4c8c-0ad5-4cdf-d6be-55c336a82b1f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 119
}
},
"source": [
"#Installing BERT module\n",
"!pip install bert-tensorflow"
],
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"text": [
"Collecting bert-tensorflow\n",
"\u001b[?25l Downloading https://files.pythonhosted.org/packages/a6/66/7eb4e8b6ea35b7cc54c322c816f976167a43019750279a8473d355800a93/bert_tensorflow-1.0.1-py2.py3-none-any.whl (67kB)\n",
"\r\u001b[K |████▉ | 10kB 16.6MB/s eta 0:00:01\r\u001b[K |█████████▊ | 20kB 1.8MB/s eta 0:00:01\r\u001b[K |██████████████▋ | 30kB 2.6MB/s eta 0:00:01\r\u001b[K |███████████████████▍ | 40kB 1.7MB/s eta 0:00:01\r\u001b[K |████████████████████████▎ | 51kB 2.1MB/s eta 0:00:01\r\u001b[K |█████████████████████████████▏ | 61kB 2.6MB/s eta 0:00:01\r\u001b[K |████████████████████████████████| 71kB 2.3MB/s \n",
"\u001b[?25hRequirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from bert-tensorflow) (1.12.0)\n",
"Installing collected packages: bert-tensorflow\n",
"Successfully installed bert-tensorflow-1.0.1\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "hhbGEfwgdEtw",
"colab_type": "code",
"outputId": "6f199f26-93b3-4580-f9b6-456f88989533",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
}
},
"source": [
"#Importing BERT modules\n",
"import bert\n",
"from bert import run_classifier\n",
"from bert import optimization\n",
"from bert import tokenization"
],
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/optimization.py:87: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.\n",
"\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kyzTzLpyqJUf",
"colab_type": "text"
},
"source": [
"##Setting The Output Directory\n",
"---\n",
"While fine-tuning the model, we will save the training checkpoints and the model in an output directory so that we can use the trained model for our predictions later.\n",
"\n",
"The following code block sets an output directory :\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "US_EAnICvP7f",
"colab_type": "code",
"outputId": "ff060b68-a834-4e85-bd9f-54c2760c04e8",
"cellView": "both",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"# Set the output directory for saving model file\n",
"OUTPUT_DIR = '/GD/My Drive/Colab Notebooks/BERT/bert_news_category'\n",
"\n",
"#@markdown Whether or not to clear/delete the directory and create a new one\n",
"DO_DELETE = False #@param {type:\"boolean\"}\n",
"\n",
"if DO_DELETE:\n",
" try:\n",
" tf.gfile.DeleteRecursively(OUTPUT_DIR)\n",
" except:\n",
" pass\n",
"\n",
"tf.gfile.MakeDirs(OUTPUT_DIR)\n",
"print('***** Model output directory: {} *****'.format(OUTPUT_DIR))"
],
"execution_count": 5,
"outputs": [
{
"output_type": "stream",
"text": [
"***** Model output directory: /GD/My Drive/Colab Notebooks/BERT/bert_news_category *****\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pmFYvkylMwXn",
"colab_type": "text"
},
"source": [
"##Loading The Data\n",
"---\n",
"We will now load the data from a Google Drive directory and will also split the training set in to training and validation sets.\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "VIsetAbCam6y",
"colab_type": "code",
"colab": {}
},
"source": [
"train = pd.read_excel(\"/GD/My Drive/Colab Notebooks/News_category/Datasets/Data_Train.xlsx\")\n",
"test = pd.read_excel(\"/GD/My Drive/Colab Notebooks/News_category/Datasets/Data_Test.xlsx\")\n",
"\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"train, val = train_test_split(train, test_size = 0.2, random_state = 100)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "BP-kcaJ7bGza",
"colab_type": "code",
"outputId": "b090eaa4-eba6-4e86-8006-d4926d6d67ce",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"source": [
"#Training set sample\n",
"train.head(5)"
],
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>STORY</th>\n",
" <th>SECTION</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>4359</th>\n",
" <td>Oil prices extended its rally to a five-month ...</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3520</th>\n",
" <td>Apple shares have risen more than 10% in March...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4530</th>\n",
" <td>The veteran actor plays a retired army officer...</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6945</th>\n",
" <td>BuzzFeed could use a boost. Two years ago, the...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2298</th>\n",
" <td>Bigg Boss Tamil 2 fame Mahat Raghavendra on We...</td>\n",
" <td>2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" STORY SECTION\n",
"4359 Oil prices extended its rally to a five-month ... 3\n",
"3520 Apple shares have risen more than 10% in March... 1\n",
"4530 The veteran actor plays a retired army officer... 2\n",
"6945 BuzzFeed could use a boost. Two years ago, the... 1\n",
"2298 Bigg Boss Tamil 2 fame Mahat Raghavendra on We... 2"
]
},
"metadata": {
"tags": []
},
"execution_count": 7
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "i4g40DO4bJCR",
"colab_type": "code",
"outputId": "7d5e1cf5-4ed9-4815-dd32-224fc048212b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"source": [
"#Test set sample\n",
"test.head()"
],
"execution_count": 59,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>STORY</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2019 will see gadgets like gaming smartphones ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>It has also unleashed a wave of changes in the...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>It can be confusing to pick the right smartpho...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>The mobile application is integrated with a da...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>We have rounded up some of the gadgets that sh...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" STORY\n",
"0 2019 will see gadgets like gaming smartphones ...\n",
"1 It has also unleashed a wave of changes in the...\n",
"2 It can be confusing to pick the right smartpho...\n",
"3 The mobile application is integrated with a da...\n",
"4 We have rounded up some of the gadgets that sh..."
]
},
"metadata": {
"tags": []
},
"execution_count": 59
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "e_rukDBlbvCj",
"colab_type": "code",
"outputId": "50f8f587-97cd-457f-b151-cf2b33adbc9d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"source": [
"print(\"Training Set Shape :\", train.shape)\n",
"print(\"Validation Set Shape :\", val.shape)\n",
"print(\"Test Set Shape :\", test.shape)"
],
"execution_count": 9,
"outputs": [
{
"output_type": "stream",
"text": [
"Training Set Shape : (6102, 2)\n",
"Validation Set Shape : (1526, 2)\n",
"Test Set Shape : (2748, 1)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "k428NfLdcAqR",
"colab_type": "code",
"outputId": "14f3de25-f919-451f-c44f-9b278be84590",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"#Features in the dataset\n",
"train.columns"
],
"execution_count": 10,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"Index(['STORY', 'SECTION'], dtype='object')"
]
},
"metadata": {
"tags": []
},
"execution_count": 10
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "U9IwgBb-cOm3",
"colab_type": "code",
"outputId": "6b46ab7b-d27a-41ca-9f75-b17e36136464",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"#unique classes\n",
"train['SECTION'].unique()"
],
"execution_count": 11,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([3, 1, 2, 0])"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "kToy7D-TSrn_",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 279
},
"outputId": "97bb564f-2c73-489b-db7b-52591588de53"
},
"source": [
"#Distribution of classes\n",
"train['SECTION'].value_counts().plot(kind = 'bar')"
],
"execution_count": 12,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7ff5921f9ba8>"
]
},
"metadata": {
"tags": []
},
"execution_count": 12
},
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD1CAYAAAC87SVQAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAMrUlEQVR4nO3df6zd9V3H8ecLKkSnkZLWprbdLtFG\n0/mDYVMw8w8MCSuwpDMxBExGQ9D6BziX+Mfq/KNmy7T/qBlxklSpA+MgZLrQhGbYNJplGrZelPBj\nbLZikTZA72QBF8wm7O0f99t4hve291fP4fJ+PpKbc87n+z3nfM435Hm+/Z7vOaSqkCT1cNGkJyBJ\nGh+jL0mNGH1JasToS1IjRl+SGjH6ktTImklP4FzWrVtXU1NTk56GJK0qjz/++Derav1cy97W0Z+a\nmmJ6enrS05CkVSXJ8/Mt8/COJDVi9CWpEaMvSY0YfUlqxOhLUiNGX5IaMfqS1IjRl6RG3tZfzroQ\npvY+MukpLMjJ/TdNegqS3oHc05ekRoy+JDVi9CWpEaMvSY0YfUlqxOhLUiNGX5IaMfqS1IjRl6RG\njL4kNWL0JakRoy9JjRh9SWrE6EtSI+eNfpItSf4+ydeSPJPkt4fxy5McSXJ8uFw7jCfJ3UlOJHky\nyVUjj7V7WP94kt0X7mVJkuaykD39N4DfqaptwDXAnUm2AXuBo1W1FTg63Aa4Adg6/O0B7oHZNwlg\nH3A1sAPYd/aNQpI0HueNflW9WFX/PFz/L+BZYBOwC7hvWO0+4EPD9V3A/TXrMeCyJBuBDwBHquqV\nqvoWcATYuaKvRpJ0Tos6pp9kCngf8BVgQ1W9OCx6CdgwXN8EvDByt1PD2HzjkqQxWXD0k/ww8DfA\nR6vqtdFlVVVArcSEkuxJMp1kemZmZiUeUpI0WFD0k/wAs8H/66r622H45eGwDcPlmWH8NLBl5O6b\nh7H5xr9PVR2oqu1VtX39+vWLeS2SpPNYyNk7Ae4Fnq2qPx5ZdAg4ewbObuDhkfHbhrN4rgFeHQ4D\nPQpcn2Tt8AHu9cOYJGlM1ixgnfcDHwaeSvLEMPZxYD/wUJI7gOeBm4dlh4EbgRPA68DtAFX1SpJP\nAseG9T5RVa+syKuQJC3IeaNfVV8GMs/i6+ZYv4A753msg8DBxUxQkrRy/EauJDVi9CWpEaMvSY0Y\nfUlqxOhLUiNGX5IaMfqS1IjRl6RGjL4kNWL0JakRoy9JjRh9SWrE6EtSI0Zfkhox+pLUiNGXpEaM\nviQ1YvQlqRGjL0mNGH1JasToS1IjRl+SGjH6ktSI0ZekRoy+JDVi9CWpEaMvSY0YfUlqxOhLUiNG\nX5IaMfqS1IjRl6RGjL4kNWL0JakRoy9JjRh9SWrE6EtSI0ZfkhpZc74VkhwEPgicqaqfGcZ+H/gN\nYGZY7eNVdXhY9rvAHcCbwEeq6tFhfCfwaeBi4C+qav/KvhRNwtTeRyY9hQU5uf+mSU9BeltYyJ7+\nZ4Gdc4z/SVVdOfydDf424BbgvcN9/izJxUkuBj4D3ABsA24d1pUkjdF59/Sr6ktJphb4eLuAB6vq\nO8C/JzkB7BiWnaiq5wCSPDis+7VFz1iStGTLOaZ/V5InkxxMsnYY2wS8MLLOqWFsvnFJ0hgtNfr3\nAD8BXAm8CPzRSk0oyZ4k00mmZ2Zmzn8HSdKCLSn6VfVyVb1ZVd8D/pz/O4RzGtgysurmYWy+8bke\n+0BVba+q7evXr1/K9CRJ81hS9JNsHLn5K8DTw/VDwC1JLk1yBbAV+CpwDNia5IoklzD7Ye+hpU9b\nkrQUCzll8wHgWmBdklPAPuDaJFcCBZwEfhOgqp5J8hCzH9C+AdxZVW8Oj3MX8Cizp2werKpnVvzV\nSJLOaSFn79w6x/C951j/U8Cn5hg/DBxe1OwkSSvKb+RKUiNGX5IaMfqS1IjRl6RGjL4kNWL0JakR\noy9JjRh9SWrE6EtSI0Zfkho5788wSBof//eTutDc05ekRoy+JDVi9CWpEaMvSY0YfUlqxOhLUiNG\nX5IaMfqS1IjRl6RGjL4kNWL0JakRoy9JjRh9SWrE6EtSI0Zfkhox+pLUiNGXpEaMviQ1YvQlqRGj\nL0mNGH1JasToS1IjRl+SGjH6ktSI0ZekRoy+JDVi9CWpEaMvSY2cN/pJDiY5k+TpkbHLkxxJcny4\nXDuMJ8ndSU4keTLJVSP32T2sfzzJ7gvzciRJ57KQPf3PAjvfMrYXOFpVW4Gjw22AG4Ctw98e4B6Y\nfZMA9gFXAzuAfWffKCRJ47PmfCtU1ZeSTL1leBdw7XD9PuAfgI8N4/dXVQGPJbksycZh3SNV9QpA\nkiPMvpE8sOxXIElzmNr7yKSnsCAn99801udb6jH9DVX14nD9JWDDcH0T8MLIeqeGsfnG/58ke5JM\nJ5memZlZ4vQkSXNZ9ge5w159rcBczj7egaraXlXb169fv1IPK0li6dF/eThsw3B5Zhg/DWwZWW/z\nMDbfuCRpjJYa/UPA2TNwdgMPj4zfNpzFcw3w6nAY6FHg+iRrhw9wrx/GJEljdN4PcpM8wOwHseuS\nnGL2LJz9wENJ7gCeB24eVj8M3AicAF4HbgeoqleSfBI4Nqz3ibMf6kqSxmchZ+/cOs+i6+ZYt4A7\n53mcg8DBRc1OkrSi/EauJDVi9CWpEaMvSY0YfUlqxOhLUiNGX5IaMfqS1IjRl6RGjL4kNWL0JakR\noy9JjRh9SWrE6EtSI0Zfkhox+pLUiNGXpEaMviQ1YvQlqRGjL0mNGH1JasToS1IjRl+SGjH6ktSI\n0ZekRoy+JDVi9CWpEaMvSY0YfUlqxOhLUiNGX5IaMfqS1IjRl6RGjL4kNWL0JakRoy9JjRh9SWrE\n6EtSI0ZfkhpZVvSTnEzyVJInkkwPY5cnOZLk+HC5dhhPkruTnEjyZJKrVuIFSJIWbiX29H+5qq6s\nqu3D7b3A0araChwdbgPcAGwd/vYA96zAc0uSFuFCHN7ZBdw3XL8P+NDI+P016zHgsiQbL8DzS5Lm\nsdzoF/B3SR5PsmcY21BVLw7XXwI2DNc3AS+M3PfUMPZ9kuxJMp1kemZmZpnTkySNWrPM+/9SVZ1O\n8mPAkSRfH11YVZWkFvOAVXUAOACwffv2Rd1XknRuy9rTr6rTw+UZ4AvADuDls4dthsszw+qngS0j\nd988jEmSxmTJ0U/yriQ/cvY6cD3wNHAI2D2stht4eLh+CLhtOIvnGuDVkcNAkqQxWM7hnQ3AF5Kc\nfZzPVdUXkxwDHkpyB/A8cPOw/mHgRuAE8Dpw+zKeW5K0BEuOflU9B/z8HOP/CVw3x3gBdy71+SRJ\ny+c3ciWpEaMvSY0YfUlqxOhLUiNGX5IaMfqS1IjRl6RGjL4kNWL0JakRoy9JjRh9SWrE6EtSI0Zf\nkhox+pLUiNGXpEaMviQ1YvQlqRGjL0m
NGH1JasToS1IjRl+SGjH6ktSI0ZekRoy+JDVi9CWpEaMv\nSY0YfUlqxOhLUiNGX5IaMfqS1IjRl6RGjL4kNWL0JakRoy9JjRh9SWrE6EtSI0Zfkhox+pLUyNij\nn2Rnkm8kOZFk77ifX5I6G2v0k1wMfAa4AdgG3Jpk2zjnIEmdjXtPfwdwoqqeq6rvAg8Cu8Y8B0lq\nK1U1vidLfhXYWVW/Ptz+MHB1Vd01ss4eYM9w86eAb4xtgku3DvjmpCfxDuL2XFluz5WzWrble6pq\n/VwL1ox7JudTVQeAA5Oex2Ikma6q7ZOexzuF23NluT1XzjthW4778M5pYMvI7c3DmCRpDMYd/WPA\n1iRXJLkEuAU4NOY5SFJbYz28U1VvJLkLeBS4GDhYVc+Mcw4XyKo6HLUKuD1Xlttz5az6bTnWD3Il\nSZPlN3IlqRGjL0mNGH1JauRtd56++kny08Am4CtV9e2R8Z1V9cXJzWz1GbblLma3J8yeEn2oqp6d\n3KxWryQ7gKqqY8NPxuwEvl5Vhyc8tSVzT38FJbl90nNYbZJ8BHgY+C3g6SSjP8vxB5OZ1eqU5GPM\n/rRJgK8OfwEe8McNFy/JPuBu4J4kfwj8KfAuYG+S35vo5JbBs3dWUJL/qKp3T3oeq0mSp4BfrKpv\nJ5kCPg/8VVV9Osm/VNX7JjrBVSTJvwLvrar/ecv4JcAzVbV1MjNbnYb/Nq8ELgVeAjZX1WtJfpDZ\nf5X+3EQnuEQe3lmkJE/OtwjYMM65vENcdPaQTlWdTHIt8Pkk72F2m2rhvgf8OPD8W8Y3Dsu0OG9U\n1ZvA60n+rapeA6iq/06yaren0V+8DcAHgG+9ZTzAP41/Oqvey0murKonAIY9/g8CB4GfnezUVp2P\nAkeTHAdeGMbeDfwkcNe899J8vpvkh6rqdeAXzg4m+VFW8Zuoh3cWKcm9wF9W1ZfnWPa5qvq1CUxr\n1Uqymdk9qpfmWPb+qvrHCUxr1UpyEbM/YT76Qe6xYY9Vi5Dk0qr6zhzj64CNVfXUBKa1bEZfkhrx\n7B1JasToS1IjRl+SGjH6ktSI0ZekRv4XHggHDSdX50IAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"tags": []
}
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "IuMOGwFui4it",
"colab_type": "code",
"colab": {}
},
"source": [
"DATA_COLUMN = 'STORY'\n",
"LABEL_COLUMN = 'SECTION'\n",
"# The list containing all the classes (train['SECTION'].unique())\n",
"label_list = [0, 1, 2, 3]"
],
"execution_count": 0,
"outputs": []
},
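{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"colab": {}
},
"source": [
"# Optional helper (a minimal sketch, not part of the hackathon data itself):\n",
"# the section ids listed above correspond to human-readable names, which is\n",
"# handy when inspecting predictions later. The dictionary simply restates that mapping.\n",
"label_names = {0: 'Politics', 1: 'Technology', 2: 'Entertainment', 3: 'Business'}\n",
"print(label_names[train[LABEL_COLUMN].iloc[0]])"
],
"execution_count": 0,
"outputs": []
},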
{
"cell_type": "markdown",
"metadata": {
"id": "V399W0rqNJ-Z",
"colab_type": "text"
},
"source": [
"## Data Preprocessing\n",
"\n",
"BERT model accept only a specific type of input and the datasets are usually structured to have have the following four features:\n",
"\n",
"* guid : A unique id that represents an observation.\n",
"* text_a : The text we need to classify into given categories\n",
"* text_b: It is used when we're training a model to understand the relationship between sentences and it does not apply for classification problems.\n",
"* label: It consists of the labels or classes or categories that a given text belongs to.\n",
" \n",
"In our dataset we have text_a and label. The following code block will create objects for each of the above mentioned features for all the records in our dataset using the InputExample class provided in the BERT library.\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "p9gEt5SmM6i6",
"colab_type": "code",
"colab": {}
},
"source": [
"train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None,\n",
" text_a = x[DATA_COLUMN], \n",
" text_b = None, \n",
" label = x[LABEL_COLUMN]), axis = 1)\n",
"\n",
"val_InputExamples = val.apply(lambda x: bert.run_classifier.InputExample(guid=None, \n",
" text_a = x[DATA_COLUMN], \n",
" text_b = None, \n",
" label = x[LABEL_COLUMN]), axis = 1)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "K50MFQXXWFJM",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 221
},
"outputId": "d4636ef7-9099-410d-c393-ddb4f713c255"
},
"source": [
"train_InputExamples"
],
"execution_count": 15,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"4359 <bert.run_classifier.InputExample object at 0x...\n",
"3520 <bert.run_classifier.InputExample object at 0x...\n",
"4530 <bert.run_classifier.InputExample object at 0x...\n",
"6945 <bert.run_classifier.InputExample object at 0x...\n",
"2298 <bert.run_classifier.InputExample object at 0x...\n",
" ... \n",
"79 <bert.run_classifier.InputExample object at 0x...\n",
"3927 <bert.run_classifier.InputExample object at 0x...\n",
"5955 <bert.run_classifier.InputExample object at 0x...\n",
"6936 <bert.run_classifier.InputExample object at 0x...\n",
"5640 <bert.run_classifier.InputExample object at 0x...\n",
"Length: 6102, dtype: object"
]
},
"metadata": {
"tags": []
},
"execution_count": 15
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "a7UC2dnVRsoZ",
"colab_type": "code",
"outputId": "105924ba-0e83-4f65-f7c9-ea7089866043",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 258
}
},
"source": [
"print(\"Row 0 - guid of training set : \", train_InputExamples.iloc[0].guid)\n",
"print(\"\\n__________\\nRow 0 - text_a of training set : \", train_InputExamples.iloc[0].text_a)\n",
"print(\"\\n__________\\nRow 0 - text_b of training set : \", train_InputExamples.iloc[0].text_b)\n",
"print(\"\\n__________\\nRow 0 - label of training set : \", train_InputExamples.iloc[0].label)"
],
"execution_count": 16,
"outputs": [
{
"output_type": "stream",
"text": [
"Row 0 - guid of training set : None\n",
"\n",
"__________\n",
"Row 0 - text_a of training set : Oil prices extended its rally to a five-month high as conflict in Libya increased the risk of new supply outages\n",
"\n",
"\n",
"Indian rupee today weakened marginally against US dollar, tracking losses in other Asian currencies as traders awaited further details on a possible US-China trade deal. Higher crude oil prices also dampened sentiment. At 9.15 am, the rupee was trading at 69.46 a dollar, down 0.34% from its previous close of 69.23. The home currency opened at 69.34 a dollar.\n",
"\n",
"__________\n",
"Row 0 - text_b of training set : None\n",
"\n",
"__________\n",
"Row 0 - label of training set : 3\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qMWiDtpyQSoU",
"colab_type": "text"
},
"source": [
"We will now get down to business with the pretrained BERT. In this example we will use the ```bert_uncased_L-12_H-768_A-12/1``` model. To check all available versions click [here](https://tfhub.dev/s?network-architecture=transformer&publisher=google).\n",
"\n",
"We will be using the vocab.txt file in the model to map the words in the dataset to indexes. Also the loaded BERT model is trained on uncased/lowercase data and hence the data we feed to train the model should also be of lowercase.\n",
"\n",
"---\n",
"\n",
"The following code block loads the pre-trained BERT model and initializers a tokenizer object for tokenizing the texts.\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "IhJSe0QHNG7U",
"colab_type": "code",
"outputId": "5591de28-d634-4e39-df73-6576d7f4cfb3",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 119
}
},
"source": [
"\n",
"# This is a path to an uncased (all lowercase) version of BERT\n",
"BERT_MODEL_HUB = \"https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1\"\n",
"\n",
"def create_tokenizer_from_hub_module():\n",
" \"\"\"Get the vocab file and casing info from the Hub module.\"\"\"\n",
" with tf.Graph().as_default():\n",
" bert_module = hub.Module(BERT_MODEL_HUB)\n",
" tokenization_info = bert_module(signature=\"tokenization_info\", as_dict=True)\n",
" with tf.Session() as sess:\n",
" vocab_file, do_lower_case = sess.run([tokenization_info[\"vocab_file\"],\n",
" tokenization_info[\"do_lower_case\"]])\n",
" \n",
" return bert.tokenization.FullTokenizer(\n",
" vocab_file=vocab_file, do_lower_case=do_lower_case)\n",
"\n",
"tokenizer = create_tokenizer_from_hub_module()"
],
"execution_count": 17,
"outputs": [
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/tokenization.py:125: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.\n",
"\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/tokenization.py:125: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.\n",
"\n"
],
"name": "stderr"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "t3T3jSpjSxmd",
"colab_type": "code",
"outputId": "bc437cfd-4c1a-4384-b080-0881c99eb34f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 54
}
},
"source": [
"#Here is what the tokenised sample of the first training set observation looks like\n",
"print(tokenizer.tokenize(train_InputExamples.iloc[0].text_a))"
],
"execution_count": 31,
"outputs": [
{
"output_type": "stream",
"text": [
"['oil', 'prices', 'extended', 'its', 'rally', 'to', 'a', 'five', '-', 'month', 'high', 'as', 'conflict', 'in', 'libya', 'increased', 'the', 'risk', 'of', 'new', 'supply', 'out', '##ages', 'indian', 'ru', '##pee', 'today', 'weakened', 'marginal', '##ly', 'against', 'us', 'dollar', ',', 'tracking', 'losses', 'in', 'other', 'asian', 'cu', '##rre', '##ncies', 'as', 'traders', 'awaited', 'further', 'details', 'on', 'a', 'possible', 'us', '-', 'china', 'trade', 'deal', '.', 'higher', 'crude', 'oil', 'prices', 'also', 'damp', '##ened', 'sentiment', '.', 'at', '9', '.', '15', 'am', ',', 'the', 'ru', '##pee', 'was', 'trading', 'at', '69', '.', '46', 'a', 'dollar', ',', 'down', '0', '.', '34', '%', 'from', 'its', 'previous', 'close', 'of', '69', '.', '23', '.', 'the', 'home', 'currency', 'opened', 'at', '69', '.', '34', 'a', 'dollar', '.']\n"
],
"name": "stdout"
}
]
},
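{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"colab": {}
},
"source": [
"# A quick illustration (sketch): FullTokenizer also exposes convert_tokens_to_ids,\n",
"# which looks each WordPiece token up in the model's vocab.txt. These indexes,\n",
"# plus the [CLS] and [SEP] markers, are what appear in the input_ids built next.\n",
"sample_tokens = tokenizer.tokenize(train_InputExamples.iloc[0].text_a)\n",
"print(tokenizer.convert_tokens_to_ids(sample_tokens)[:10])"
],
"execution_count": 0,
"outputs": []
},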
{
"cell_type": "markdown",
"metadata": {
"id": "mtvrR5eusZPO",
"colab_type": "text"
},
"source": [
"We will now format out text in to input features which the BERT model expects. We will also set a sequence length which will be the length of the input features."
]
},
{
"cell_type": "code",
"metadata": {
"id": "LL5W8gEGRTAf",
"colab_type": "code",
"outputId": "ae954549-b82e-412c-9772-168a3664a4e2",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"source": [
"# We'll set sequences to be at most 128 tokens long.\n",
"MAX_SEQ_LENGTH = 128\n",
"\n",
"# Convert our train and validation features to InputFeatures that BERT understands.\n",
"train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)\n",
"\n",
"val_features = bert.run_classifier.convert_examples_to_features(val_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)"
],
"execution_count": 20,
"outputs": [
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/run_classifier.py:774: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.\n",
"\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/run_classifier.py:774: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.\n",
"\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Writing example 0 of 6102\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Writing example 0 of 6102\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] oil prices extended its rally to a five - month high as conflict in libya increased the risk of new supply out ##ages indian ru ##pee today weakened marginal ##ly against us dollar , tracking losses in other asian cu ##rre ##ncies as traders awaited further details on a possible us - china trade deal . higher crude oil prices also damp ##ened sentiment . at 9 . 15 am , the ru ##pee was trading at 69 . 46 a dollar , down 0 . 34 % from its previous close of 69 . 23 . the home currency opened at 69 . 34 a dollar . [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] oil prices extended its rally to a five - month high as conflict in libya increased the risk of new supply out ##ages indian ru ##pee today weakened marginal ##ly against us dollar , tracking losses in other asian cu ##rre ##ncies as traders awaited further details on a possible us - china trade deal . higher crude oil prices also damp ##ened sentiment . at 9 . 15 am , the ru ##pee was trading at 69 . 46 a dollar , down 0 . 34 % from its previous close of 69 . 23 . the home currency opened at 69 . 34 a dollar . [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 3514 7597 3668 2049 8320 2000 1037 2274 1011 3204 2152 2004 4736 1999 12917 3445 1996 3891 1997 2047 4425 2041 13923 2796 21766 28084 2651 11855 14785 2135 2114 2149 7922 1010 9651 6409 1999 2060 4004 12731 14343 14767 2004 13066 19605 2582 4751 2006 1037 2825 2149 1011 2859 3119 3066 1012 3020 13587 3514 7597 2036 10620 6675 15792 1012 2012 1023 1012 2321 2572 1010 1996 21766 28084 2001 6202 2012 6353 1012 4805 1037 7922 1010 2091 1014 1012 4090 1003 2013 2049 3025 2485 1997 6353 1012 2603 1012 1996 2188 9598 2441 2012 6353 1012 4090 1037 7922 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 3514 7597 3668 2049 8320 2000 1037 2274 1011 3204 2152 2004 4736 1999 12917 3445 1996 3891 1997 2047 4425 2041 13923 2796 21766 28084 2651 11855 14785 2135 2114 2149 7922 1010 9651 6409 1999 2060 4004 12731 14343 14767 2004 13066 19605 2582 4751 2006 1037 2825 2149 1011 2859 3119 3066 1012 3020 13587 3514 7597 2036 10620 6675 15792 1012 2012 1023 1012 2321 2572 1010 1996 21766 28084 2001 6202 2012 6353 1012 4805 1037 7922 1010 2091 1014 1012 4090 1003 2013 2049 3025 2485 1997 6353 1012 2603 1012 1996 2188 9598 2441 2012 6353 1012 4090 1037 7922 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 3 (id = 3)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 3 (id = 3)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] apple shares have risen more than 10 % in march to a four - month high ahead of monday ’ s event . here ’ s what to watch out for : the service will focus on original content , including tv shows and movies from producers such as damien cha ##zell ##e , m . night shy ##ama ##lan , and op ##rah . there are documentaries , such as ' elephant queen , ' and animation ##s , like “ wolf ##walker ##s \" by oscar - nominated studio cartoon saloon , along with a re - imagining of the “ amazing stories \" from steven spielberg , and a drama starring jennifer an ##isto ##n and reese with ##ers ##poo ##n . an [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] apple shares have risen more than 10 % in march to a four - month high ahead of monday ’ s event . here ’ s what to watch out for : the service will focus on original content , including tv shows and movies from producers such as damien cha ##zell ##e , m . night shy ##ama ##lan , and op ##rah . there are documentaries , such as ' elephant queen , ' and animation ##s , like “ wolf ##walker ##s \" by oscar - nominated studio cartoon saloon , along with a re - imagining of the “ amazing stories \" from steven spielberg , and a drama starring jennifer an ##isto ##n and reese with ##ers ##poo ##n . an [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 6207 6661 2031 13763 2062 2084 2184 1003 1999 2233 2000 1037 2176 1011 3204 2152 3805 1997 6928 1521 1055 2724 1012 2182 1521 1055 2054 2000 3422 2041 2005 1024 1996 2326 2097 3579 2006 2434 4180 1010 2164 2694 3065 1998 5691 2013 6443 2107 2004 12587 15775 24085 2063 1010 1049 1012 2305 11004 8067 5802 1010 1998 6728 10404 1012 2045 2024 15693 1010 2107 2004 1005 10777 3035 1010 1005 1998 7284 2015 1010 2066 1523 4702 26965 2015 1000 2011 7436 1011 4222 2996 9476 17078 1010 2247 2007 1037 2128 1011 16603 1997 1996 1523 6429 3441 1000 2013 7112 28740 1010 1998 1037 3689 4626 7673 2019 20483 2078 1998 15883 2007 2545 24667 2078 1012 2019 102\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 6207 6661 2031 13763 2062 2084 2184 1003 1999 2233 2000 1037 2176 1011 3204 2152 3805 1997 6928 1521 1055 2724 1012 2182 1521 1055 2054 2000 3422 2041 2005 1024 1996 2326 2097 3579 2006 2434 4180 1010 2164 2694 3065 1998 5691 2013 6443 2107 2004 12587 15775 24085 2063 1010 1049 1012 2305 11004 8067 5802 1010 1998 6728 10404 1012 2045 2024 15693 1010 2107 2004 1005 10777 3035 1010 1005 1998 7284 2015 1010 2066 1523 4702 26965 2015 1000 2011 7436 1011 4222 2996 9476 17078 1010 2247 2007 1037 2128 1011 16603 1997 1996 1523 6429 3441 1000 2013 7112 28740 1010 1998 1037 3689 4626 7673 2019 20483 2078 1998 15883 2007 2545 24667 2078 1012 2019 102\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] the veteran actor plays a retired army officer who enjoys a formidable reputation in the village , thanks to the seemingly never - ending stories of his battle exploits [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] the veteran actor plays a retired army officer who enjoys a formidable reputation in the village , thanks to the seemingly never - ending stories of his battle exploits [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1996 8003 3364 3248 1037 3394 2390 2961 2040 15646 1037 18085 5891 1999 1996 2352 1010 4283 2000 1996 9428 2196 1011 4566 3441 1997 2010 2645 20397 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1996 8003 3364 3248 1037 3394 2390 2961 2040 15646 1037 18085 5891 1999 1996 2352 1010 4283 2000 1996 9428 2196 1011 4566 3441 1997 2010 2645 20397 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 2 (id = 2)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 2 (id = 2)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] buzz ##fe ##ed could use a boost . two years ago , the company was valued at $ 1 . 7 billion and its prospects seemed bright . but the digital - media industry has gotten tough ##er since then . the company laid off 100 people last fall and shut down its podcast team in september . before buzz ##fe ##ed investors can find an exit through a sale or public offering , the company needs to prove it can develop a diverse mix of revenue from creating tv shows and films , commerce , and licensing or mer ##chan ##dis ##ing . in an interview , chief executive jonah pere ##tti declined to discuss his company ’ s revenue for 2018 , but said [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] buzz ##fe ##ed could use a boost . two years ago , the company was valued at $ 1 . 7 billion and its prospects seemed bright . but the digital - media industry has gotten tough ##er since then . the company laid off 100 people last fall and shut down its podcast team in september . before buzz ##fe ##ed investors can find an exit through a sale or public offering , the company needs to prove it can develop a diverse mix of revenue from creating tv shows and films , commerce , and licensing or mer ##chan ##dis ##ing . in an interview , chief executive jonah pere ##tti declined to discuss his company ’ s revenue for 2018 , but said [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 12610 7959 2098 2071 2224 1037 12992 1012 2048 2086 3283 1010 1996 2194 2001 11126 2012 1002 1015 1012 1021 4551 1998 2049 16746 2790 4408 1012 2021 1996 3617 1011 2865 3068 2038 5407 7823 2121 2144 2059 1012 1996 2194 4201 2125 2531 2111 2197 2991 1998 3844 2091 2049 16110 2136 1999 2244 1012 2077 12610 7959 2098 9387 2064 2424 2019 6164 2083 1037 5096 2030 2270 5378 1010 1996 2194 3791 2000 6011 2009 2064 4503 1037 7578 4666 1997 6599 2013 4526 2694 3065 1998 3152 1010 6236 1010 1998 13202 2030 21442 14856 10521 2075 1012 1999 2019 4357 1010 2708 3237 15617 23976 6916 6430 2000 6848 2010 2194 1521 1055 6599 2005 2760 1010 2021 2056 102\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 12610 7959 2098 2071 2224 1037 12992 1012 2048 2086 3283 1010 1996 2194 2001 11126 2012 1002 1015 1012 1021 4551 1998 2049 16746 2790 4408 1012 2021 1996 3617 1011 2865 3068 2038 5407 7823 2121 2144 2059 1012 1996 2194 4201 2125 2531 2111 2197 2991 1998 3844 2091 2049 16110 2136 1999 2244 1012 2077 12610 7959 2098 9387 2064 2424 2019 6164 2083 1037 5096 2030 2270 5378 1010 1996 2194 3791 2000 6011 2009 2064 4503 1037 7578 4666 1997 6599 2013 4526 2694 3065 1998 3152 1010 6236 1010 1998 13202 2030 21442 14856 10521 2075 1012 1999 2019 4357 1010 2708 3237 15617 23976 6916 6430 2000 6848 2010 2194 1521 1055 6599 2005 2760 1010 2021 2056 102\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] big ##g boss tamil 2 fame maha ##t rag ##haven ##dra on wednesday took to twitter to announce his engagement with longtime girlfriend pr ##achi mis ##hra the couple ’ s relationship had been in the spotlight since maha ##t ’ s participation in the second season of big ##g boss tamil on the show , he court ##ed many controversies due to his ill - tempered nature , which even led to physical fights [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] big ##g boss tamil 2 fame maha ##t rag ##haven ##dra on wednesday took to twitter to announce his engagement with longtime girlfriend pr ##achi mis ##hra the couple ’ s relationship had been in the spotlight since maha ##t ’ s participation in the second season of big ##g boss tamil on the show , he court ##ed many controversies due to his ill - tempered nature , which even led to physical fights [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2502 2290 5795 6008 1016 4476 24404 2102 17768 14650 7265 2006 9317 2165 2000 10474 2000 14970 2010 8147 2007 11155 6513 10975 21046 28616 13492 1996 3232 1521 1055 3276 2018 2042 1999 1996 17763 2144 24404 2102 1521 1055 6577 1999 1996 2117 2161 1997 2502 2290 5795 6008 2006 1996 2265 1010 2002 2457 2098 2116 25962 2349 2000 2010 5665 1011 22148 3267 1010 2029 2130 2419 2000 3558 9590 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2502 2290 5795 6008 1016 4476 24404 2102 17768 14650 7265 2006 9317 2165 2000 10474 2000 14970 2010 8147 2007 11155 6513 10975 21046 28616 13492 1996 3232 1521 1055 3276 2018 2042 1999 1996 17763 2144 24404 2102 1521 1055 6577 1999 1996 2117 2161 1997 2502 2290 5795 6008 2006 1996 2265 1010 2002 2457 2098 2116 25962 2349 2000 2010 5665 1011 22148 3267 1010 2029 2130 2419 2000 3558 9590 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 2 (id = 2)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 2 (id = 2)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Writing example 0 of 1526\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Writing example 0 of 1526\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] traders said , reduced demand from local jewel ##lers as well as retail buyers led to the slide in gold prices . in the international market , spot gold was trading at usd 1 , 299 . 30 an ounce while silver was at usd 15 . 31 an ounce in new york . [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] traders said , reduced demand from local jewel ##lers as well as retail buyers led to the slide in gold prices . in the international market , spot gold was trading at usd 1 , 299 . 30 an ounce while silver was at usd 15 . 31 an ounce in new york . [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 13066 2056 1010 4359 5157 2013 2334 13713 12910 2004 2092 2004 7027 17394 2419 2000 1996 7358 1999 2751 7597 1012 1999 1996 2248 3006 1010 3962 2751 2001 6202 2012 13751 1015 1010 25926 1012 2382 2019 19471 2096 3165 2001 2012 13751 2321 1012 2861 2019 19471 1999 2047 2259 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 13066 2056 1010 4359 5157 2013 2334 13713 12910 2004 2092 2004 7027 17394 2419 2000 1996 7358 1999 2751 7597 1012 1999 1996 2248 3006 1010 3962 2751 2001 6202 2012 13751 1015 1010 25926 1012 2382 2019 19471 2096 3165 2001 2012 13751 2321 1012 2861 2019 19471 1999 2047 2259 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 3 (id = 3)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 3 (id = 3)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] a recent amazon job posting , seeking a quality assurance manager for alexa data services in bucharest , describes the role humans play : “ every day she [ alexa ] listen ##s to thousands of people talking to her about different topics and different languages , and she needs our help to make sense of it all . \" the want ad continues : “ this is big data handling like you ’ ve never seen it . we ’ re creating , labeling , cu ##rating and analyzing vast quantities of speech on a daily basis . \" amazon ’ s review process for speech data begins when alexa pulls a random , small sampling of customer voice recordings and sends the audio files [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] a recent amazon job posting , seeking a quality assurance manager for alexa data services in bucharest , describes the role humans play : “ every day she [ alexa ] listen ##s to thousands of people talking to her about different topics and different languages , and she needs our help to make sense of it all . \" the want ad continues : “ this is big data handling like you ’ ve never seen it . we ’ re creating , labeling , cu ##rating and analyzing vast quantities of speech on a daily basis . \" amazon ’ s review process for speech data begins when alexa pulls a random , small sampling of customer voice recordings and sends the audio files [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1037 3522 9733 3105 14739 1010 6224 1037 3737 16375 3208 2005 24969 2951 2578 1999 14261 1010 5577 1996 2535 4286 2377 1024 1523 2296 2154 2016 1031 24969 1033 4952 2015 2000 5190 1997 2111 3331 2000 2014 2055 2367 7832 1998 2367 4155 1010 1998 2016 3791 2256 2393 2000 2191 3168 1997 2009 2035 1012 1000 1996 2215 4748 4247 1024 1523 2023 2003 2502 2951 8304 2066 2017 1521 2310 2196 2464 2009 1012 2057 1521 2128 4526 1010 28847 1010 12731 15172 1998 20253 6565 12450 1997 4613 2006 1037 3679 3978 1012 1000 9733 1521 1055 3319 2832 2005 4613 2951 4269 2043 24969 8005 1037 6721 1010 2235 16227 1997 8013 2376 5633 1998 10255 1996 5746 6764 102\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1037 3522 9733 3105 14739 1010 6224 1037 3737 16375 3208 2005 24969 2951 2578 1999 14261 1010 5577 1996 2535 4286 2377 1024 1523 2296 2154 2016 1031 24969 1033 4952 2015 2000 5190 1997 2111 3331 2000 2014 2055 2367 7832 1998 2367 4155 1010 1998 2016 3791 2256 2393 2000 2191 3168 1997 2009 2035 1012 1000 1996 2215 4748 4247 1024 1523 2023 2003 2502 2951 8304 2066 2017 1521 2310 2196 2464 2009 1012 2057 1521 2128 4526 1010 28847 1010 12731 15172 1998 20253 6565 12450 1997 4613 2006 1037 3679 3978 1012 1000 9733 1521 1055 3319 2832 2005 4613 2951 4269 2043 24969 8005 1037 6721 1010 2235 16227 1997 8013 2376 5633 1998 10255 1996 5746 6764 102\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] new delhi : the bharatiya janata party manifesto monday promised to increase intake capacity of central engineering , management , science and law institutions by 50 % over the next five years if voted to power but remains silent on the impending new education policy . central institutions encompass indian institutes of technology ( ii ##ts ) , indian institute of management ( ii ##ms ) , all indian institutions of medical science ( ai ##im ##s ) , national law universities , indian institute of science technology and research ( ii ##ser ) among others . [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] new delhi : the bharatiya janata party manifesto monday promised to increase intake capacity of central engineering , management , science and law institutions by 50 % over the next five years if voted to power but remains silent on the impending new education policy . central institutions encompass indian institutes of technology ( ii ##ts ) , indian institute of management ( ii ##ms ) , all indian institutions of medical science ( ai ##im ##s ) , national law universities , indian institute of science technology and research ( ii ##ser ) among others . [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2047 6768 1024 1996 24243 20308 2283 17124 6928 5763 2000 3623 13822 3977 1997 2430 3330 1010 2968 1010 2671 1998 2375 4896 2011 2753 1003 2058 1996 2279 2274 2086 2065 5444 2000 2373 2021 3464 4333 2006 1996 17945 2047 2495 3343 1012 2430 4896 25281 2796 12769 1997 2974 1006 2462 3215 1007 1010 2796 2820 1997 2968 1006 2462 5244 1007 1010 2035 2796 4896 1997 2966 2671 1006 9932 5714 2015 1007 1010 2120 2375 5534 1010 2796 2820 1997 2671 2974 1998 2470 1006 2462 8043 1007 2426 2500 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2047 6768 1024 1996 24243 20308 2283 17124 6928 5763 2000 3623 13822 3977 1997 2430 3330 1010 2968 1010 2671 1998 2375 4896 2011 2753 1003 2058 1996 2279 2274 2086 2065 5444 2000 2373 2021 3464 4333 2006 1996 17945 2047 2495 3343 1012 2430 4896 25281 2796 12769 1997 2974 1006 2462 3215 1007 1010 2796 2820 1997 2968 1006 2462 5244 1007 1010 2035 2796 4896 1997 2966 2671 1006 9932 5714 2015 1007 1010 2120 2375 5534 1010 2796 2820 1997 2671 2974 1998 2470 1006 2462 8043 1007 2426 2500 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] facebook declined to comment , while what ##sa ##pp and ins ##tagram did not immediately respond to a request for comment . the panel has previously summoned social network twitter inc ' s chief executive jack dorsey to appear on monday to discuss the same topic . \" these are issues for all internet services globally , \" twitter said on friday , adding that colin crowe ##ll , its global vice president of public policy , is to meet the panel on monday . [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] facebook declined to comment , while what ##sa ##pp and ins ##tagram did not immediately respond to a request for comment . the panel has previously summoned social network twitter inc ' s chief executive jack dorsey to appear on monday to discuss the same topic . \" these are issues for all internet services globally , \" twitter said on friday , adding that colin crowe ##ll , its global vice president of public policy , is to meet the panel on monday . [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 9130 6430 2000 7615 1010 2096 2054 3736 9397 1998 16021 23091 2106 2025 3202 6869 2000 1037 5227 2005 7615 1012 1996 5997 2038 3130 11908 2591 2897 10474 4297 1005 1055 2708 3237 2990 27332 2000 3711 2006 6928 2000 6848 1996 2168 8476 1012 1000 2122 2024 3314 2005 2035 4274 2578 16452 1010 1000 10474 2056 2006 5958 1010 5815 2008 6972 25657 3363 1010 2049 3795 3580 2343 1997 2270 3343 1010 2003 2000 3113 1996 5997 2006 6928 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 9130 6430 2000 7615 1010 2096 2054 3736 9397 1998 16021 23091 2106 2025 3202 6869 2000 1037 5227 2005 7615 1012 1996 5997 2038 3130 11908 2591 2897 10474 4297 1005 1055 2708 3237 2990 27332 2000 3711 2006 6928 2000 6848 1996 2168 8476 1012 1000 2122 2024 3314 2005 2035 4274 2578 16452 1010 1000 10474 2056 2006 5958 1010 5815 2008 6972 25657 3363 1010 2049 3795 3580 2343 1997 2270 3343 1010 2003 2000 3113 1996 5997 2006 6928 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: None\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] “ in my home , i often communicate in both english and hindi . so i ’ m very excited about this feature that lets me use the assistant in a more natural way , \" said gupta . google india ##and ##roid goa ##nd ##roid go pie ##sam ##sun ##g galaxy j ##2 core ##no ##kia ##lav ##ami ##cr ##oma ##x ##go ##og ##le assistant ##go ##og ##le assistant marathi ##go ##og ##le go ##ma ##ps go [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] “ in my home , i often communicate in both english and hindi . so i ’ m very excited about this feature that lets me use the assistant in a more natural way , \" said gupta . google india ##and ##roid goa ##nd ##roid go pie ##sam ##sun ##g galaxy j ##2 core ##no ##kia ##lav ##ami ##cr ##oma ##x ##go ##og ##le assistant ##go ##og ##le assistant marathi ##go ##og ##le go ##ma ##ps go [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1523 1999 2026 2188 1010 1045 2411 10639 1999 2119 2394 1998 9269 1012 2061 1045 1521 1049 2200 7568 2055 2023 3444 2008 11082 2033 2224 1996 3353 1999 1037 2062 3019 2126 1010 1000 2056 20512 1012 8224 2634 5685 22943 15244 4859 22943 2175 11345 21559 19729 2290 9088 1046 2475 4563 3630 21128 14973 10631 26775 9626 2595 3995 8649 2571 3353 3995 8649 2571 3353 18388 3995 8649 2571 2175 2863 4523 2175 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1523 1999 2026 2188 1010 1045 2411 10639 1999 2119 2394 1998 9269 1012 2061 1045 1521 1049 2200 7568 2055 2023 3444 2008 11082 2033 2224 1996 3353 1999 1037 2062 3019 2126 1010 1000 2056 20512 1012 8224 2634 5685 22943 15244 4859 22943 2175 11345 21559 19729 2290 9088 1046 2475 4563 3630 21128 14973 10631 26775 9626 2595 3995 8649 2571 3353 3995 8649 2571 3353 18388 3995 8649 2571 2175 2863 4523 2175 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 1 (id = 1)\n"
],
"name": "stderr"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "WZEmm8KEUX3F",
"colab_type": "code",
"outputId": "f8fa6a4c-7282-4469-8b2e-7f5243139ddc",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 241
}
},
"source": [
"#Example on first observation in the training set\n",
"print(\"Sentence : \", train_InputExamples.iloc[0].text_a)\n",
"print(\"-\"*30)\n",
"print(\"Tokens : \", tokenizer.tokenize(train_InputExamples.iloc[0].text_a))\n",
"print(\"-\"*30)\n",
"print(\"Input IDs : \", train_features[0].input_ids)\n",
"print(\"-\"*30)\n",
"print(\"Input Masks : \", train_features[0].input_mask)\n",
"print(\"-\"*30)\n",
"print(\"Segment IDs : \", train_features[0].segment_ids)"
],
"execution_count": 21,
"outputs": [
{
"output_type": "stream",
"text": [
"Sentence : Oil prices extended its rally to a five-month high as conflict in Libya increased the risk of new supply outages\n",
"\n",
"\n",
"Indian rupee today weakened marginally against US dollar, tracking losses in other Asian currencies as traders awaited further details on a possible US-China trade deal. Higher crude oil prices also dampened sentiment. At 9.15 am, the rupee was trading at 69.46 a dollar, down 0.34% from its previous close of 69.23. The home currency opened at 69.34 a dollar.\n",
"------------------------------\n",
"Tokens : ['oil', 'prices', 'extended', 'its', 'rally', 'to', 'a', 'five', '-', 'month', 'high', 'as', 'conflict', 'in', 'libya', 'increased', 'the', 'risk', 'of', 'new', 'supply', 'out', '##ages', 'indian', 'ru', '##pee', 'today', 'weakened', 'marginal', '##ly', 'against', 'us', 'dollar', ',', 'tracking', 'losses', 'in', 'other', 'asian', 'cu', '##rre', '##ncies', 'as', 'traders', 'awaited', 'further', 'details', 'on', 'a', 'possible', 'us', '-', 'china', 'trade', 'deal', '.', 'higher', 'crude', 'oil', 'prices', 'also', 'damp', '##ened', 'sentiment', '.', 'at', '9', '.', '15', 'am', ',', 'the', 'ru', '##pee', 'was', 'trading', 'at', '69', '.', '46', 'a', 'dollar', ',', 'down', '0', '.', '34', '%', 'from', 'its', 'previous', 'close', 'of', '69', '.', '23', '.', 'the', 'home', 'currency', 'opened', 'at', '69', '.', '34', 'a', 'dollar', '.']\n",
"------------------------------\n",
"Input IDs : [101, 3514, 7597, 3668, 2049, 8320, 2000, 1037, 2274, 1011, 3204, 2152, 2004, 4736, 1999, 12917, 3445, 1996, 3891, 1997, 2047, 4425, 2041, 13923, 2796, 21766, 28084, 2651, 11855, 14785, 2135, 2114, 2149, 7922, 1010, 9651, 6409, 1999, 2060, 4004, 12731, 14343, 14767, 2004, 13066, 19605, 2582, 4751, 2006, 1037, 2825, 2149, 1011, 2859, 3119, 3066, 1012, 3020, 13587, 3514, 7597, 2036, 10620, 6675, 15792, 1012, 2012, 1023, 1012, 2321, 2572, 1010, 1996, 21766, 28084, 2001, 6202, 2012, 6353, 1012, 4805, 1037, 7922, 1010, 2091, 1014, 1012, 4090, 1003, 2013, 2049, 3025, 2485, 1997, 6353, 1012, 2603, 1012, 1996, 2188, 9598, 2441, 2012, 6353, 1012, 4090, 1037, 7922, 1012, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n",
"------------------------------\n",
"Input Masks : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n",
"------------------------------\n",
"Segment IDs : [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ccp5trMwRtmr",
"colab_type": "text"
},
"source": [
"##Creating A Multi-Class Classifier Model\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "6o2a5ZIvRcJq",
"colab_type": "code",
"colab": {}
},
"source": [
"def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,\n",
" num_labels):\n",
" \n",
" bert_module = hub.Module(\n",
" BERT_MODEL_HUB,\n",
" trainable=True)\n",
" bert_inputs = dict(\n",
" input_ids=input_ids,\n",
" input_mask=input_mask,\n",
" segment_ids=segment_ids)\n",
" bert_outputs = bert_module(\n",
" inputs=bert_inputs,\n",
" signature=\"tokens\",\n",
" as_dict=True)\n",
"\n",
" # Use \"pooled_output\" for classification tasks on an entire sentence.\n",
" # Use \"sequence_outputs\" for token-level output.\n",
" output_layer = bert_outputs[\"pooled_output\"]\n",
"\n",
" hidden_size = output_layer.shape[-1].value\n",
"\n",
" # Create our own layer to tune for politeness data.\n",
" output_weights = tf.get_variable(\n",
" \"output_weights\", [num_labels, hidden_size],\n",
" initializer=tf.truncated_normal_initializer(stddev=0.02))\n",
"\n",
" output_bias = tf.get_variable(\n",
" \"output_bias\", [num_labels], initializer=tf.zeros_initializer())\n",
"\n",
" with tf.variable_scope(\"loss\"):\n",
"\n",
" # Dropout helps prevent overfitting\n",
" output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)\n",
"\n",
" logits = tf.matmul(output_layer, output_weights, transpose_b=True)\n",
" logits = tf.nn.bias_add(logits, output_bias)\n",
" log_probs = tf.nn.log_softmax(logits, axis=-1)\n",
"\n",
" # Convert labels into one-hot encoding\n",
" one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)\n",
"\n",
" predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))\n",
" # If we're predicting, we want predicted labels and the probabiltiies.\n",
" if is_predicting:\n",
" return (predicted_labels, log_probs)\n",
"\n",
" # If we're train/eval, compute loss between predicted and actual label\n",
" per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)\n",
" loss = tf.reduce_mean(per_example_loss)\n",
" return (loss, predicted_labels, log_probs)\n"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "FnH-AnOQ9KKW",
"colab_type": "code",
"colab": {}
},
"source": [
"#A function that adapts our model to work for training, evaluation, and prediction.\n",
"\n",
"# model_fn_builder actually creates our model function\n",
"# using the passed parameters for num_labels, learning_rate, etc.\n",
"def model_fn_builder(num_labels, learning_rate, num_train_steps,\n",
" num_warmup_steps):\n",
" \"\"\"Returns `model_fn` closure for TPUEstimator.\"\"\"\n",
" def model_fn(features, labels, mode, params): # pylint: disable=unused-argument\n",
" \"\"\"The `model_fn` for TPUEstimator.\"\"\"\n",
"\n",
" input_ids = features[\"input_ids\"]\n",
" input_mask = features[\"input_mask\"]\n",
" segment_ids = features[\"segment_ids\"]\n",
" label_ids = features[\"label_ids\"]\n",
"\n",
" is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)\n",
" \n",
" # TRAIN and EVAL\n",
" if not is_predicting:\n",
"\n",
" (loss, predicted_labels, log_probs) = create_model(\n",
" is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)\n",
"\n",
" train_op = bert.optimization.create_optimizer(\n",
" loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)\n",
"\n",
" # Calculate evaluation metrics. \n",
" def metric_fn(label_ids, predicted_labels):\n",
" accuracy = tf.metrics.accuracy(label_ids, predicted_labels)\n",
" true_pos = tf.metrics.true_positives(\n",
" label_ids,\n",
" predicted_labels)\n",
" true_neg = tf.metrics.true_negatives(\n",
" label_ids,\n",
" predicted_labels) \n",
" false_pos = tf.metrics.false_positives(\n",
" label_ids,\n",
" predicted_labels) \n",
" false_neg = tf.metrics.false_negatives(\n",
" label_ids,\n",
" predicted_labels)\n",
" \n",
" return {\n",
" \"eval_accuracy\": accuracy,\n",
" \"true_positives\": true_pos,\n",
" \"true_negatives\": true_neg,\n",
" \"false_positives\": false_pos,\n",
" \"false_negatives\": false_neg\n",
" }\n",
"\n",
" eval_metrics = metric_fn(label_ids, predicted_labels)\n",
"\n",
" if mode == tf.estimator.ModeKeys.TRAIN:\n",
" return tf.estimator.EstimatorSpec(mode=mode,\n",
" loss=loss,\n",
" train_op=train_op)\n",
" else:\n",
" return tf.estimator.EstimatorSpec(mode=mode,\n",
" loss=loss,\n",
" eval_metric_ops=eval_metrics)\n",
" else:\n",
" (predicted_labels, log_probs) = create_model(\n",
" is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)\n",
"\n",
" predictions = {\n",
" 'probabilities': log_probs,\n",
" 'labels': predicted_labels\n",
" }\n",
" return tf.estimator.EstimatorSpec(mode, predictions=predictions)\n",
"\n",
" # Return the actual model function in the closure\n",
" return model_fn\n"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "OjwJ4bTeWXD8",
"colab_type": "code",
"colab": {}
},
"source": [
"# Compute train and warmup steps from batch size\n",
"# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)\n",
"BATCH_SIZE = 32\n",
"LEARNING_RATE = 2e-5\n",
"NUM_TRAIN_EPOCHS = 3.0\n",
"# Warmup is a period of time where the learning rate is small and gradually increases--usually helps training.\n",
"WARMUP_PROPORTION = 0.1\n",
"# Model configs\n",
"SAVE_CHECKPOINTS_STEPS = 300\n",
"SAVE_SUMMARY_STEPS = 100\n",
"\n",
"# Compute train and warmup steps from batch size\n",
"num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)\n",
"num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)\n",
"\n",
"# Specify output directory and number of checkpoint steps to save\n",
"run_config = tf.estimator.RunConfig(\n",
" model_dir=OUTPUT_DIR,\n",
" save_summary_steps=SAVE_SUMMARY_STEPS,\n",
" save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)\n",
"\n",
"# Specify output directory and number of checkpoint steps to save\n",
"run_config = tf.estimator.RunConfig(\n",
" model_dir=OUTPUT_DIR,\n",
" save_summary_steps=SAVE_SUMMARY_STEPS,\n",
" save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "q_WebpS1X97v",
"colab_type": "code",
"outputId": "917d1322-1812-42d6-901c-46798a121652",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 275
}
},
"source": [
"#Initializing the model and the estimator\n",
"model_fn = model_fn_builder(\n",
" num_labels=len(label_list),\n",
" learning_rate=LEARNING_RATE,\n",
" num_train_steps=num_train_steps,\n",
" num_warmup_steps=num_warmup_steps)\n",
"\n",
"estimator = tf.estimator.Estimator(\n",
" model_fn=model_fn,\n",
" config=run_config,\n",
" params={\"batch_size\": BATCH_SIZE})\n"
],
"execution_count": 25,
"outputs": [
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Using config: {'_model_dir': '/GD/My Drive/Colab Notebooks/BERT/bert_news_category', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 300, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true\n",
"graph_options {\n",
" rewrite_options {\n",
" meta_optimizer_iterations: ONE\n",
" }\n",
"}\n",
", '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7ff5448672b0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Using config: {'_model_dir': '/GD/My Drive/Colab Notebooks/BERT/bert_news_category', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 300, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true\n",
"graph_options {\n",
" rewrite_options {\n",
" meta_optimizer_iterations: ONE\n",
" }\n",
"}\n",
", '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7ff5448672b0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}\n"
],
"name": "stderr"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NOO3RfG1DYLo",
"colab_type": "text"
},
"source": [
"we will now create an input builder function that takes our training feature set (`train_features`) and produces a generator. This is a pretty standard design pattern for working with Tensorflow [Estimators](https://www.tensorflow.org/guide/estimators)."
]
},
{
"cell_type": "code",
"metadata": {
"id": "1Pv2bAlOX_-K",
"colab_type": "code",
"colab": {}
},
"source": [
"# Create an input function for training. drop_remainder = True for using TPUs.\n",
"train_input_fn = bert.run_classifier.input_fn_builder(\n",
" features=train_features,\n",
" seq_length=MAX_SEQ_LENGTH,\n",
" is_training=True,\n",
" drop_remainder=False)\n",
"\n",
"# Create an input function for validating. drop_remainder = True for using TPUs.\n",
"val_input_fn = run_classifier.input_fn_builder(\n",
" features=val_features,\n",
" seq_length=MAX_SEQ_LENGTH,\n",
" is_training=False,\n",
" drop_remainder=False)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "_vrumsg9uygH",
"colab_type": "text"
},
"source": [
"##Training & Evaluating"
]
},
{
"cell_type": "code",
"metadata": {
"id": "nucD4gluYJmK",
"colab_type": "code",
"outputId": "7793bdd3-8be5-4f9c-9c70-de03397186ad",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"source": [
"#Training the model\n",
"print(f'Beginning Training!')\n",
"current_time = datetime.now()\n",
"estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)\n",
"print(\"Training took time \", datetime.now() - current_time)"
],
"execution_count": 28,
"outputs": [
{
"output_type": "stream",
"text": [
"Beginning Training!\n",
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Calling model_fn.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Calling model_fn.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From <ipython-input-22-bdfb628bf45b>:33: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From <ipython-input-22-bdfb628bf45b>:33: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/optimization.py:27: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.\n",
"\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/optimization.py:27: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.\n",
"\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/optimization.py:32: The name tf.train.polynomial_decay is deprecated. Please use tf.compat.v1.train.polynomial_decay instead.\n",
"\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/optimization.py:32: The name tf.train.polynomial_decay is deprecated. Please use tf.compat.v1.train.polynomial_decay instead.\n",
"\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/optimization.py:70: The name tf.trainable_variables is deprecated. Please use tf.compat.v1.trainable_variables instead.\n",
"\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert/optimization.py:70: The name tf.trainable_variables is deprecated. Please use tf.compat.v1.trainable_variables instead.\n",
"\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/math_grad.py:1375: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Use tf.where in 2.0, which has the same broadcast rule as np.where\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/math_grad.py:1375: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.\n",
"Instructions for updating:\n",
"Use tf.where in 2.0, which has the same broadcast rule as np.where\n",
"/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/indexed_slices.py:424: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.\n",
" \"Converting sparse IndexedSlices to a dense Tensor of unknown shape. \"\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done calling model_fn.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done calling model_fn.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Create CheckpointSaverHook.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Create CheckpointSaverHook.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Graph was finalized.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Graph was finalized.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Running local_init_op.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Running local_init_op.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done running local_init_op.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done running local_init_op.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving checkpoints for 0 into /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving checkpoints for 0 into /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 1.671329, step = 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 1.671329, step = 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.579778\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.579778\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.02809212, step = 100 (172.482 sec)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.02809212, step = 100 (172.482 sec)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.640514\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.640514\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.0046886336, step = 200 (156.124 sec)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.0046886336, step = 200 (156.124 sec)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving checkpoints for 300 into /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving checkpoints for 300 into /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.614066\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.614066\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.0064441273, step = 300 (162.853 sec)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.0064441273, step = 300 (162.853 sec)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.640734\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.640734\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.0016644242, step = 400 (156.068 sec)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.0016644242, step = 400 (156.068 sec)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.639597\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:global_step/sec: 0.639597\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.0014373187, step = 500 (156.353 sec)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:loss = 0.0014373187, step = 500 (156.353 sec)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving checkpoints for 572 into /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving checkpoints for 572 into /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Loss for final step: 0.013614067.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Loss for final step: 0.013614067.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"Training took time 0:16:15.354629\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "PPVEXhNjYXC-",
"colab_type": "code",
"outputId": "23deac72-c825-46a9-835b-721bbdef57e8",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 564
}
},
"source": [
"#Evaluating the model with Validation set\n",
"estimator.evaluate(input_fn=val_input_fn, steps=None)"
],
"execution_count": 30,
"outputs": [
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Calling model_fn.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Calling model_fn.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n",
"/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/indexed_slices.py:424: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.\n",
" \"Converting sparse IndexedSlices to a dense Tensor of unknown shape. \"\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done calling model_fn.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done calling model_fn.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Starting evaluation at 2019-11-14T11:32:13Z\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Starting evaluation at 2019-11-14T11:32:13Z\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Graph was finalized.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Graph was finalized.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Restoring parameters from /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt-572\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Restoring parameters from /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt-572\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Running local_init_op.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Running local_init_op.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done running local_init_op.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done running local_init_op.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Finished evaluation at 2019-11-14-11:32:46\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Finished evaluation at 2019-11-14-11:32:46\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving dict for global step 572: eval_accuracy = 0.98689383, false_negatives = 8.0, false_positives = 7.0, global_step = 572, loss = 0.060107384, true_negatives = 338.0, true_positives = 1173.0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving dict for global step 572: eval_accuracy = 0.98689383, false_negatives = 8.0, false_positives = 7.0, global_step = 572, loss = 0.060107384, true_negatives = 338.0, true_positives = 1173.0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving 'checkpoint_path' summary for global step 572: /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt-572\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saving 'checkpoint_path' summary for global step 572: /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt-572\n"
],
"name": "stderr"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'eval_accuracy': 0.98689383,\n",
" 'false_negatives': 8.0,\n",
" 'false_positives': 7.0,\n",
" 'global_step': 572,\n",
" 'loss': 0.060107384,\n",
" 'true_negatives': 338.0,\n",
" 'true_positives': 1173.0}"
]
},
"metadata": {
"tags": []
},
"execution_count": 30
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dTF8Om8f7S7e",
"colab_type": "text"
},
"source": [
"##Vola !! We got an evaluation accuracy of 98% on the validation set by just having trained the model for 3 epochs and a few hundred steps."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_6f9Rlhrupt6",
"colab_type": "text"
},
"source": [
"##Predicting For Test Set"
]
},
{
"cell_type": "code",
"metadata": {
"id": "OsrbTD2EJTVl",
"colab_type": "code",
"colab": {}
},
"source": [
"\"\"\"Politics: 0\n",
"Technology: 1\n",
"Entertainment: 2\n",
"Business: 3\"\"\"\n",
"\n",
"# A method to get predictions\n",
"def getPrediction(in_sentences):\n",
" #A list to map the actual labels to the predictions\n",
" labels = [\"Politics\", \"Technology\",\"Entertainment\",\"Business\"]\n",
"\n",
" #Transforming the test data into BERT accepted form\n",
" input_examples = [run_classifier.InputExample(guid=\"\", text_a = x, text_b = None, label = 0) for x in in_sentences] \n",
" \n",
" #Creating input features for Test data\n",
" input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)\n",
"\n",
" #Predicting the classes \n",
" predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)\n",
" predictions = estimator.predict(predict_input_fn)\n",
" return [(sentence, prediction['probabilities'],prediction['labels'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "-thbodgih_VJ",
"colab_type": "code",
"colab": {}
},
"source": [
"pred_sentences = list(test['STORY'])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "QrZmvZySKQTm",
"colab_type": "code",
"outputId": "e8e4a6e0-7b38-4a6a-f411-88ad01e5f7f1",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"source": [
"predictions = getPrediction(pred_sentences)"
],
"execution_count": 34,
"outputs": [
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Writing example 0 of 2748\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Writing example 0 of 2748\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] 2019 will see ga ##dgets like gaming smartphone ##s and wear ##able medical devices lifting the user experience to a whole new level mint - india - wire consumer technology ##con ##sume ##r technology trends in new year ##tech ga ##dgets ##fold ##able phones ##gami ##ng smartphone ##sw ##ear ##able medical devices ##tech ##nology new delhi : ga ##dgets have become an integral part of our lives with most of us relying on some form of factor to communicate , com ##mute , work , be informed or entertained . year 2019 will see some ga ##dgets lifting the user experience to a whole new level . here ’ s what we can expect to see : smartphone ##s with fold ##able screens : fold ##able [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] 2019 will see ga ##dgets like gaming smartphone ##s and wear ##able medical devices lifting the user experience to a whole new level mint - india - wire consumer technology ##con ##sume ##r technology trends in new year ##tech ga ##dgets ##fold ##able phones ##gami ##ng smartphone ##sw ##ear ##able medical devices ##tech ##nology new delhi : ga ##dgets have become an integral part of our lives with most of us relying on some form of factor to communicate , com ##mute , work , be informed or entertained . year 2019 will see some ga ##dgets lifting the user experience to a whole new level . here ’ s what we can expect to see : smartphone ##s with fold ##able screens : fold ##able [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 10476 2097 2156 11721 28682 2066 10355 26381 2015 1998 4929 3085 2966 5733 8783 1996 5310 3325 2000 1037 2878 2047 2504 12927 1011 2634 1011 7318 7325 2974 8663 23545 2099 2974 12878 1999 2047 2095 15007 11721 28682 10371 3085 11640 26517 3070 26381 26760 14644 3085 2966 5733 15007 21020 2047 6768 1024 11721 28682 2031 2468 2019 9897 2112 1997 2256 3268 2007 2087 1997 2149 18345 2006 2070 2433 1997 5387 2000 10639 1010 4012 26746 1010 2147 1010 2022 6727 2030 21474 1012 2095 10476 2097 2156 2070 11721 28682 8783 1996 5310 3325 2000 1037 2878 2047 2504 1012 2182 1521 1055 2054 2057 2064 5987 2000 2156 1024 26381 2015 2007 10671 3085 12117 1024 10671 3085 102\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 10476 2097 2156 11721 28682 2066 10355 26381 2015 1998 4929 3085 2966 5733 8783 1996 5310 3325 2000 1037 2878 2047 2504 12927 1011 2634 1011 7318 7325 2974 8663 23545 2099 2974 12878 1999 2047 2095 15007 11721 28682 10371 3085 11640 26517 3070 26381 26760 14644 3085 2966 5733 15007 21020 2047 6768 1024 11721 28682 2031 2468 2019 9897 2112 1997 2256 3268 2007 2087 1997 2149 18345 2006 2070 2433 1997 5387 2000 10639 1010 4012 26746 1010 2147 1010 2022 6727 2030 21474 1012 2095 10476 2097 2156 2070 11721 28682 8783 1996 5310 3325 2000 1037 2878 2047 2504 1012 2182 1521 1055 2054 2057 2064 5987 2000 2156 1024 26381 2015 2007 10671 3085 12117 1024 10671 3085 102\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] it has also unleashed a wave of changes in the mc ##u that will make sure its future is a lot different than its past kevin fei ##ge had signal ##led diversity and more representation in the post - phase 3 mc ##u and end ##game does a lot to showcase the initiative [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] it has also unleashed a wave of changes in the mc ##u that will make sure its future is a lot different than its past kevin fei ##ge had signal ##led diversity and more representation in the post - phase 3 mc ##u and end ##game does a lot to showcase the initiative [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2009 2038 2036 22416 1037 4400 1997 3431 1999 1996 11338 2226 2008 2097 2191 2469 2049 2925 2003 1037 2843 2367 2084 2049 2627 4901 24664 3351 2018 4742 3709 8906 1998 2062 6630 1999 1996 2695 1011 4403 1017 11338 2226 1998 2203 16650 2515 1037 2843 2000 13398 1996 6349 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2009 2038 2036 22416 1037 4400 1997 3431 1999 1996 11338 2226 2008 2097 2191 2469 2049 2925 2003 1037 2843 2367 2084 2049 2627 4901 24664 3351 2018 4742 3709 8906 1998 2062 6630 1999 1996 2695 1011 4403 1017 11338 2226 1998 2203 16650 2515 1037 2843 2000 13398 1996 6349 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] it can be confusing to pick the right smartphone for yourself , so we have segregated the top smartphone ##s under rs 20 , 000 according to their strengths . the best smartphone ##s under ₹ ##20 , 000 cat ##ego ##rise ##d according to performance , camera , design and battery life mint - india - wire phones under rs 2000 ##0 ##po ##co f1 ##real ##me u ##1 ##red ##mi note 6 pro ##real ##me 2 pro ##hon ##or play ##no ##kia 7 . 1 ##nova 3 ##ias ##us zen ##fo ##ne max pro m1 gone are the days when you had to shell out big buck for buying smartphone ##s with premium features . technology has become more accessible recently and the biggest [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] it can be confusing to pick the right smartphone for yourself , so we have segregated the top smartphone ##s under rs 20 , 000 according to their strengths . the best smartphone ##s under ₹ ##20 , 000 cat ##ego ##rise ##d according to performance , camera , design and battery life mint - india - wire phones under rs 2000 ##0 ##po ##co f1 ##real ##me u ##1 ##red ##mi note 6 pro ##real ##me 2 pro ##hon ##or play ##no ##kia 7 . 1 ##nova 3 ##ias ##us zen ##fo ##ne max pro m1 gone are the days when you had to shell out big buck for buying smartphone ##s with premium features . technology has become more accessible recently and the biggest [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2009 2064 2022 16801 2000 4060 1996 2157 26381 2005 4426 1010 2061 2057 2031 24382 1996 2327 26381 2015 2104 12667 2322 1010 2199 2429 2000 2037 20828 1012 1996 2190 26381 2015 2104 1576 11387 1010 2199 4937 20265 29346 2094 2429 2000 2836 1010 4950 1010 2640 1998 6046 2166 12927 1011 2634 1011 7318 11640 2104 12667 2456 2692 6873 3597 20069 22852 4168 1057 2487 5596 4328 3602 1020 4013 22852 4168 1016 4013 8747 2953 2377 3630 21128 1021 1012 1015 13455 1017 7951 2271 16729 14876 2638 4098 4013 23290 2908 2024 1996 2420 2043 2017 2018 2000 5806 2041 2502 10131 2005 9343 26381 2015 2007 12882 2838 1012 2974 2038 2468 2062 7801 3728 1998 1996 5221 102\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2009 2064 2022 16801 2000 4060 1996 2157 26381 2005 4426 1010 2061 2057 2031 24382 1996 2327 26381 2015 2104 12667 2322 1010 2199 2429 2000 2037 20828 1012 1996 2190 26381 2015 2104 1576 11387 1010 2199 4937 20265 29346 2094 2429 2000 2836 1010 4950 1010 2640 1998 6046 2166 12927 1011 2634 1011 7318 11640 2104 12667 2456 2692 6873 3597 20069 22852 4168 1057 2487 5596 4328 3602 1020 4013 22852 4168 1016 4013 8747 2953 2377 3630 21128 1021 1012 1015 13455 1017 7951 2271 16729 14876 2638 4098 4013 23290 2908 2024 1996 2420 2043 2017 2018 2000 5806 2041 2502 10131 2005 9343 26381 2015 2007 12882 2838 1012 2974 2038 2468 2062 7801 3728 1998 1996 5221 102\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] the mobile application is integrated with a dashboard to confirm and register the pre - registered cases , to enable online interface between the ben ##ef ##icia ##ry and the panel lawyer through video con ##fer ##en ##cing and telephone facility . prasad said that a pilot project in this regard had proved useful and more than 50 , 000 people have already avail ##ed this service . till january , it resulted in enabling legal advice to 49 , 192 ben ##ef ##icia ##ries that include 36 , 52 ##6 ( women ) , 70 ##49 ( sc ) and 139 ##70 ( st ) in 11 states including uttar pradesh , bihar and all north - eastern states and the state of jammu and [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] the mobile application is integrated with a dashboard to confirm and register the pre - registered cases , to enable online interface between the ben ##ef ##icia ##ry and the panel lawyer through video con ##fer ##en ##cing and telephone facility . prasad said that a pilot project in this regard had proved useful and more than 50 , 000 people have already avail ##ed this service . till january , it resulted in enabling legal advice to 49 , 192 ben ##ef ##icia ##ries that include 36 , 52 ##6 ( women ) , 70 ##49 ( sc ) and 139 ##70 ( st ) in 11 states including uttar pradesh , bihar and all north - eastern states and the state of jammu and [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1996 4684 4646 2003 6377 2007 1037 24923 2000 12210 1998 4236 1996 3653 1011 5068 3572 1010 2000 9585 3784 8278 2090 1996 3841 12879 24108 2854 1998 1996 5997 5160 2083 2678 9530 7512 2368 6129 1998 7026 4322 1012 17476 2056 2008 1037 4405 2622 1999 2023 7634 2018 4928 6179 1998 2062 2084 2753 1010 2199 2111 2031 2525 24608 2098 2023 2326 1012 6229 2254 1010 2009 4504 1999 12067 3423 6040 2000 4749 1010 17613 3841 12879 24108 5134 2008 2421 4029 1010 4720 2575 1006 2308 1007 1010 3963 26224 1006 8040 1007 1998 16621 19841 1006 2358 1007 1999 2340 2163 2164 14940 7970 1010 16178 1998 2035 2167 1011 2789 2163 1998 1996 2110 1997 21433 1998 102\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1996 4684 4646 2003 6377 2007 1037 24923 2000 12210 1998 4236 1996 3653 1011 5068 3572 1010 2000 9585 3784 8278 2090 1996 3841 12879 24108 2854 1998 1996 5997 5160 2083 2678 9530 7512 2368 6129 1998 7026 4322 1012 17476 2056 2008 1037 4405 2622 1999 2023 7634 2018 4928 6179 1998 2062 2084 2753 1010 2199 2111 2031 2525 24608 2098 2023 2326 1012 6229 2254 1010 2009 4504 1999 12067 3423 6040 2000 4749 1010 17613 3841 12879 24108 5134 2008 2421 4029 1010 4720 2575 1006 2308 1007 1010 3963 26224 1006 8040 1007 1998 16621 19841 1006 2358 1007 1999 2340 2163 2164 14940 7970 1010 16178 1998 2035 2167 1011 2789 2163 1998 1996 2110 1997 21433 1998 102\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] we have rounded up some of the ga ##dgets that showed up in 2018 and left an ind ##eli ##ble mark on , consumers , experts and the tech industry young ##sters playing pub ##g mobile on their smartphone for hours , elderly switching off the lights using voice or a family watching their favourite movie in 4 ##k hd ##r on netflix are some of the habits which were shaped by the ga ##dgets around them . we have rounded up some of the ga ##dgets that showed up in 2018 and left an ind ##eli ##ble mark on , consumers , experts and the tech industry . the echo plus 2 takes the whole io ##t experience up by a few notch ##es with [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] we have rounded up some of the ga ##dgets that showed up in 2018 and left an ind ##eli ##ble mark on , consumers , experts and the tech industry young ##sters playing pub ##g mobile on their smartphone for hours , elderly switching off the lights using voice or a family watching their favourite movie in 4 ##k hd ##r on netflix are some of the habits which were shaped by the ga ##dgets around them . we have rounded up some of the ga ##dgets that showed up in 2018 and left an ind ##eli ##ble mark on , consumers , experts and the tech industry . the echo plus 2 takes the whole io ##t experience up by a few notch ##es with [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2057 2031 8352 2039 2070 1997 1996 11721 28682 2008 3662 2039 1999 2760 1998 2187 2019 27427 20806 3468 2928 2006 1010 10390 1010 8519 1998 1996 6627 3068 2402 15608 2652 9047 2290 4684 2006 2037 26381 2005 2847 1010 9750 11991 2125 1996 4597 2478 2376 2030 1037 2155 3666 2037 8837 3185 1999 1018 2243 10751 2099 2006 20907 2024 2070 1997 1996 14243 2029 2020 5044 2011 1996 11721 28682 2105 2068 1012 2057 2031 8352 2039 2070 1997 1996 11721 28682 2008 3662 2039 1999 2760 1998 2187 2019 27427 20806 3468 2928 2006 1010 10390 1010 8519 1998 1996 6627 3068 1012 1996 9052 4606 1016 3138 1996 2878 22834 2102 3325 2039 2011 1037 2261 18624 2229 2007 102\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2057 2031 8352 2039 2070 1997 1996 11721 28682 2008 3662 2039 1999 2760 1998 2187 2019 27427 20806 3468 2928 2006 1010 10390 1010 8519 1998 1996 6627 3068 2402 15608 2652 9047 2290 4684 2006 2037 26381 2005 2847 1010 9750 11991 2125 1996 4597 2478 2376 2030 1037 2155 3666 2037 8837 3185 1999 1018 2243 10751 2099 2006 20907 2024 2070 1997 1996 14243 2029 2020 5044 2011 1996 11721 28682 2105 2068 1012 2057 2031 8352 2039 2070 1997 1996 11721 28682 2008 3662 2039 1999 2760 1998 2187 2019 27427 20806 3468 2928 2006 1010 10390 1010 8519 1998 1996 6627 3068 1012 1996 9052 4606 1016 3138 1996 2878 22834 2102 3325 2039 2011 1037 2261 18624 2229 2007 102\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Calling model_fn.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Calling model_fn.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done calling model_fn.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done calling model_fn.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Graph was finalized.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Graph was finalized.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Restoring parameters from /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt-572\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Restoring parameters from /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt-572\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Running local_init_op.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Running local_init_op.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done running local_init_op.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done running local_init_op.\n"
],
"name": "stderr"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "dvWtkufBoCyp",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 122
},
"outputId": "bc28fe71-0a68-4d91-83bb-0805715d1fa3"
},
"source": [
"predictions[0]"
],
"execution_count": 54,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"('2019 will see gadgets like gaming smartphones and wearable medical devices lifting the user experience to a whole new level\\n\\n\\nmint-india-wire consumer technologyconsumer technology trends in New Yeartech gadgetsFoldable phonesgaming smartphoneswearable medical devicestechnology\\n\\n\\nNew Delhi: Gadgets have become an integral part of our lives with most of us relying on some form of factor to communicate, commute, work, be informed or entertained. Year 2019 will see some gadgets lifting the user experience to a whole new level. Here’s what we can expect to see:\\n\\n\\nSmartphones with foldable screens: Foldable phones are finally moving from the concept stage to commercial launches. They are made up of organic light-emitting diode (OLED) panels with higher plastic substrates, allowing them to be bent without damage.\\n\\n\\nUS-based display maker Royole Corp’s foldable phone, FlexPai, has already arrived in select markets, while Samsung’s unnamed foldable phone is expected sometime next year. Samsung’s smartphone chief executive officer D.J. Koh has said they will make a million units of it. LG, too, is expected to display a foldable phone next year. Meanwhile Apple, Nokia, Lenovo and Huawei have also been working on foldable phones, reportedly.\\n\\n\\neSIM: Very soon your smartphone won’t need a physical SIM card anymore. The eSIM technology, already used by Apple in its iPhones and Apple Watch, replaces the physical SIM with a virtually embedded chip on the motherboard. eSIMs support multiple mobile operators and can be programmed to switch services.',\n",
" array([-7.9845862e+00, -9.2094438e-04, -8.2501354e+00, -8.0514145e+00],\n",
" dtype=float32),\n",
" 1,\n",
" 'Technology')"
]
},
"metadata": {
"tags": []
},
"execution_count": 54
}
]
},
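 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "Each element of `predictions` is a tuple of the form *(input text, raw logits over the four sections, predicted label id, predicted section name)*. The logits are unnormalised scores; as a quick illustration (a minimal sketch, assuming `numpy` is available and that the logit order follows the label encoding Politics, Technology, Entertainment, Business), a softmax turns them into class probabilities."
 ]
 },
 {
 "cell_type": "code",
 "metadata": {},
 "source": [
  "import numpy as np\n",
  "\n",
  "# predictions[i] = (text, logits, predicted_label_id, predicted_section_name)\n",
  "text, logits, label_id, section = predictions[0]\n",
  "\n",
  "# Softmax over the four logits to obtain class probabilities.\n",
  "probs = np.exp(logits - np.max(logits))\n",
  "probs = probs / probs.sum()\n",
  "\n",
  "for name, p in zip(['Politics', 'Technology', 'Entertainment', 'Business'], probs):\n",
  "  print('{:>13}: {:.4f}'.format(name, p))\n",
  "print('Predicted section:', section)"
 ],
 "execution_count": 0,
 "outputs": []
 },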
{
"cell_type": "code",
"metadata": {
"id": "ERkTE8-7oQLZ",
"colab_type": "code",
"colab": {}
},
"source": [
"enc_labels = []\n",
"act_labels = []\n",
"for i in range(len(predictions)):\n",
" enc_labels.append(predictions[i][2])\n",
" act_labels.append(predictions[i][3])"
],
"execution_count": 0,
"outputs": []
},
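 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "Before writing the submission file, a quick sanity check (a small sketch, assuming `predictions` covers every test record): how are the predicted sections distributed across the test set?"
 ]
 },
 {
 "cell_type": "code",
 "metadata": {},
 "source": [
  "# Count how many test stories were assigned to each section.\n",
  "pd.Series(act_labels).value_counts()"
 ],
 "execution_count": 0,
 "outputs": []
 },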
{
"cell_type": "code",
"metadata": {
"id": "6ADAUcUuwo1I",
"colab_type": "code",
"colab": {}
},
"source": [
"pd.DataFrame(enc_labels, columns = ['SECTION']).to_excel('/GD/My Drive/Colab Notebooks/BERT/submission_bert.xlsx', index = False)"
],
"execution_count": 0,
"outputs": []
},
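 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "To verify that the submission was written correctly (a quick check, assuming the same Drive path as above), the file can be read back with pandas:"
 ]
 },
 {
 "cell_type": "code",
 "metadata": {},
 "source": [
  "# Read the saved submission back from Google Drive and preview the first rows.\n",
  "pd.read_excel('/GD/My Drive/Colab Notebooks/BERT/submission_bert.xlsx').head()"
 ],
 "execution_count": 0,
 "outputs": []
 },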
{
"cell_type": "markdown",
"metadata": {
"id": "2mRe3Jrlt8RV",
"colab_type": "text"
},
"source": [
"## Random Tester"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Y_yYnS-nt6ZU",
"colab_type": "code",
"outputId": "42650800-9a21-43ad-d6ca-0f59b9a75f57",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"source": [
"#Classifying random sentences\n",
"tests = getPrediction(['Mr.Modi is the Indian Prime Minister',\n",
" 'Gaming machines are powered by efficient micro processores and GPUs',\n",
" 'That HBO TV series is really good',\n",
" 'A trillion dollar economy '\n",
" ])"
],
"execution_count": 52,
"outputs": [
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Writing example 0 of 4\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Writing example 0 of 4\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] mr . mod ##i is the indian prime minister [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] mr . mod ##i is the indian prime minister [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2720 1012 16913 2072 2003 1996 2796 3539 2704 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2720 1012 16913 2072 2003 1996 2796 3539 2704 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] gaming machines are powered by efficient micro processor ##es and gp ##us [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] gaming machines are powered by efficient micro processor ##es and gp ##us [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 10355 6681 2024 6113 2011 8114 12702 13151 2229 1998 14246 2271 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 10355 6681 2024 6113 2011 8114 12702 13151 2229 1998 14246 2271 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] that hbo tv series is really good [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] that hbo tv series is really good [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2008 14633 2694 2186 2003 2428 2204 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 2008 14633 2694 2186 2003 2428 2204 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:*** Example ***\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:guid: \n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] a trillion dollar economy [SEP]\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:tokens: [CLS] a trillion dollar economy [SEP]\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1037 23458 7922 4610 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_ids: 101 1037 23458 7922 4610 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:input_mask: 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:label: 0 (id = 0)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Calling model_fn.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Calling model_fn.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Saver not created because there are no variables in the graph to restore\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done calling model_fn.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done calling model_fn.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Graph was finalized.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Graph was finalized.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Restoring parameters from /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt-572\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Restoring parameters from /GD/My Drive/Colab Notebooks/BERT/bert_news_category/model.ckpt-572\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Running local_init_op.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Running local_init_op.\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done running local_init_op.\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"INFO:tensorflow:Done running local_init_op.\n"
],
"name": "stderr"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "bjFLQTqAt6WG",
"colab_type": "code",
"outputId": "1a547b8e-5460-472e-c96f-2501122ce99d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 323
}
},
"source": [
"tests"
],
"execution_count": 53,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[('Mr.Modi is the Indian Prime Minister',\n",
" array([-0.10646384, -3.2651784 , -3.1611662 , -3.8909485 ], dtype=float32),\n",
" 0,\n",
" 'Politics'),\n",
" ('Gaming machines are powered by efficient micro processores and GPUs',\n",
" array([-8.072256e+00, -9.102254e-04, -8.540870e+00, -7.817997e+00],\n",
" dtype=float32),\n",
" 1,\n",
" 'Technology'),\n",
" ('That HBO TV series is really good',\n",
" array([-6.930523e+00, -6.092480e+00, -4.234040e-03, -6.920085e+00],\n",
" dtype=float32),\n",
" 2,\n",
" 'Entertainment'),\n",
" ('A trillion dollar economy ',\n",
" array([-2.1094828, -2.2138534, -3.7176704, -0.2941964], dtype=float32),\n",
" 3,\n",
" 'Business')]"
]
},
"metadata": {
"tags": []
},
"execution_count": 53
}
]
},
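 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "A compact way to read the results above, printing the predicted section next to each test sentence:"
 ]
 },
 {
 "cell_type": "code",
 "metadata": {},
 "source": [
  "# Print the predicted section alongside each hand-written test sentence.\n",
  "for text, logits, label_id, section in tests:\n",
  "  print('{:>13} <- {}'.format(section, text))"
 ],
 "execution_count": 0,
 "outputs": []
 },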
{
"cell_type": "markdown",
"metadata": {
"id": "Qi5MqgDRhZno",
"colab_type": "text"
},
"source": [
"#Reference:\n",
"Most of the code has been taken from the following resource:\n",
"\n",
"* https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb\n",
"\n"
]
}
]
}