Who? needs some information to be able to index news:
- news title
- creation date in country timezone
- URL to original article page
Some optional information could be beneficial for improving Named Entity Detection, such as:
- news body text
Other information could be useful for future features, such as:
- Article page views on the original site
- Article number of shares
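
As a rough illustration, the fields above could be modeled as a single record. This is only a sketch: the class name `Article`, the field names, and the validation rules are assumptions chosen for the example, not a format that Who? prescribes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Article:
    """Hypothetical record holding the fields listed above."""

    # Required for indexing
    title: str
    created_at: datetime  # must be timezone-aware (country timezone)
    url: str  # link to the original article page
    # Optional: may help Named Entity Detection
    body: Optional[str] = None
    # Optional: may be used by future features
    views: Optional[int] = None
    shares: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject records that miss a required field or carry a naive date.
        if not self.title or not self.url:
            raise ValueError("title and url are required")
        if self.created_at.tzinfo is None:
            raise ValueError("created_at must include a timezone")


# Example values only; the +02:00 offset and the URL are placeholders.
example_tz = timezone(timedelta(hours=2))
article = Article(
    title="Example headline",
    created_at=datetime(2017, 1, 1, 9, 0, tzinfo=example_tz),
    url="https://newspaper.example/news/1",
)
```

Only the three required fields have to be supplied; the optional ones default to `None` until the source site can provide them.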
There are several options for integration. Implementing any one of them can lead to better visibility on Who?:
- RSS Feed: Modify the existing RSS feed URL to accept a Unix timestamp URL parameter named "before" and return the news articles older than that timestamp, ordered newest first. That enables Who? to traverse news articles from the present back into the past, indexing all newspaper content exposed through the RSS feed.
- JSON API: Create an equivalent API endpoint with the following specs:
  - Method: GET
  - URL Params: before -> unix timestamp
  - Response type: JSON
  - Response format: JSON API (http://jsonapi.org/)
  - Entities required: "Article"
  - Attributes: [title, body, url, created_at, views, shares]
  - Links: [next, previous] for paging
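
To make the paging behaviour concrete, here is a minimal sketch of how a consumer might walk such an endpoint from newest to oldest. Everything specific in it is an assumption made for illustration: the base URL `https://newspaper.example/api/articles`, the helper names `first_page_url` and `iter_articles`, the resource type `"articles"`, and `created_at` being a Unix timestamp. The in-memory `PAGES` dictionary stands in for real HTTP responses so the example runs without a network.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode

BASE_URL = "https://newspaper.example/api/articles"  # hypothetical endpoint


def first_page_url(base_url: str, before: int) -> str:
    """Build the GET URL with the 'before' Unix timestamp parameter."""
    return f"{base_url}?{urlencode({'before': before})}"


def iter_articles(fetch: Callable[[str], Dict], url: Optional[str]) -> Iterator[Dict]:
    """Yield article attributes page by page, following links.next."""
    while url:
        page = fetch(url)
        for resource in page.get("data", []):
            yield resource["attributes"]
        # A missing or null 'next' link ends the traversal.
        url = page.get("links", {}).get("next")


# Fake responses in JSON API shape, ordered newest first (sample data only).
PAGES = {
    first_page_url(BASE_URL, 1500000000): {
        "data": [{
            "type": "articles", "id": "2",
            "attributes": {"title": "Newer", "body": "...", "views": 10, "shares": 1,
                           "url": "https://newspaper.example/news/2",
                           "created_at": 1499990000},
        }],
        "links": {"next": first_page_url(BASE_URL, 1499990000), "previous": None},
    },
    first_page_url(BASE_URL, 1499990000): {
        "data": [{
            "type": "articles", "id": "1",
            "attributes": {"title": "Older", "body": "...", "views": 5, "shares": 0,
                           "url": "https://newspaper.example/news/1",
                           "created_at": 1499900000},
        }],
        "links": {"next": None, "previous": first_page_url(BASE_URL, 1500000000)},
    },
}

titles = [a["title"] for a in iter_articles(PAGES.__getitem__, first_page_url(BASE_URL, 1500000000))]
print(titles)
```

Against a live endpoint, `fetch` could be replaced by a function that performs the GET request and parses the JSON body, for example with `urllib.request.urlopen` and `json.load` from the standard library.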