I've been trying to understand how to setup systems from
the ground up on Ubuntu. I just installed redis
onto
the box and here's how I did it and some things to look
out for.
To install:
Producer | |
Setup | |
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test-rep-one --partitions 6 --replication-factor 1 | |
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test --partitions 6 --replication-factor 3 | |
Single thread, no replication | |
bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 50000000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196 |
This list is based on aliases_spec.rb.
You can see also Module: RSpec::Matchers API.
matcher | aliased to | description |
---|---|---|
a_truthy_value | be_truthy | a truthy value |
a_falsey_value | be_falsey | a falsey value |
be_falsy | be_falsey | be falsy |
a_falsy_value | be_falsey | a falsy value |
$ uname -r
#cloud-config | |
# Option 1 - Full installation using cURL | |
package_update: true | |
package_upgrade: true | |
groups: | |
- docker | |
system_info: |
This serves as a quick reference and showcase of GitHub Flavored Markdown. For more complete info, see John Gruber's original spec and the Github-flavored Markdown info page.
smiling mouth revealing white straight teeth - 24426 | |
anxious expression with biting lower lip - 17012 | |
shallow depth of field - 16806 | |
early childhood age - 14067 | |
social worker - 12566 | |
smiling mouth revealing slightly crooked teeth - 12329 | |
broad grin revealing straight white teeth - 11336 | |
pediatrician - 11212 | |
preschooler age - 10873 | |
headshot - 10462 |
aboriginal | |
above average | |
abstract composition | |
abusive | |
accessories | |
accountant | |
acid wash | |
acne-prone skin | |
acne scars |
tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.
OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.
For example, given a Congressional financial disclosure report, with assets defined in a table like this: