@jenzopr
Created November 21, 2014 15:02
Typetest profile for cassandra-stress to produce an IndexOutOfBoundsException
#
# This is an example YAML profile for cassandra-stress
#
# insert data
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1)
#
# read, using query simple1:
# cassandra-stress user profile=/home/jake/stress1.yaml ops(simple1=1)
#
# mixed workload (90/10)
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1,simple1=9)
#
# Keyspace info
#
keyspace: stresscql
#
# The CQL for creating a keyspace (optional if it already exists)
#
keyspace_definition: |
  CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
#
# Table info
#
table: typestest
#
# The CQL for creating a table you wish to stress (optional if it already exists)
#
table_definition: |
  CREATE TABLE typestest (
    name text,
    choice boolean,
    date timestamp,
    address inet,
    dbl double,
    lval bigint,
    ival int,
    uid timeuuid,
    value blob,
    PRIMARY KEY((name,choice), date, address, dbl, lval, ival, uid)
  )
    WITH compaction = { 'class':'LeveledCompactionStrategy' }
# AND compression = { 'sstable_compression' : '' }
# AND comment='A table of many types to test wide rows'
#
# Optional meta information on the generated columns in the above table
# The min and max only apply to text and blob types
# The distribution field represents the total unique population
# distribution of that column across rows. Supported types are
#
#      EXP(min..max)                   An exponential distribution over the range [min..max]
#      EXTREME(min..max,shape)         An extreme value (Weibull) distribution over the range [min..max]
#      GAUSSIAN(min..max,stdvrng)      A gaussian/normal distribution, where mean=(min+max)/2, and stdev is (mean-min)/stdvrng
#      GAUSSIAN(min..max,mean,stdev)   A gaussian/normal distribution, with explicitly defined mean and stdev
#      UNIFORM(min..max)               A uniform distribution over the range [min, max]
#      FIXED(val)                      A fixed distribution, always returning the same value
#      Aliases: extr, gauss, normal, norm, weibull
#
# If preceded by ~, the distribution is inverted
#
# Defaults for all columns are size: uniform(4..8), population: uniform(1..100B), cluster: fixed(1)
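#
# Worked example of the GAUSSIAN(min..max,stdvrng) form, using the formula above.
# The numbers are illustrative only and are not used by this profile:
#      GAUSSIAN(1..1000,10)   mean  = (1 + 1000) / 2   = 500.5
#                             stdev = (500.5 - 1) / 10 = 49.95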
#
columnspec:
  - name: name
    size: uniform(1..10)
    population: uniform(1..10)     # the range of unique values to select for the field (default is 100 billion)
  - name: date
    cluster: uniform(20..40)
  - name: lval
    population: gaussian(1..1000)
    cluster: uniform(1..4)
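#
# Rough partition-size arithmetic for the cluster settings above. This is a sketch that
# assumes the number of rows in a partition is bounded by the product of the cluster
# ranges of the clustering columns, with unlisted clustering columns at the default fixed(1):
#      minimum rows per partition: 20 (date) * 1 (lval) = 20
#      maximum rows per partition: 40 (date) * 4 (lval) = 160
#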
insert:
  partitions: ~exp(20..50)        # number of unique partitions to update in a single operation
                                  # if batchcount > 1, multiple batches will be used but all partitions will
                                  # occur in all batches (unless they finish early); only the row counts will vary
  batchtype: LOGGED               # type of batch to use
  select: uniform(1..10)/10       # uniform chance any single generated CQL row will be visited in a partition;
                                  # generated for each partition independently, each time we visit it
#
# A list of queries you wish to run against the schema
#
queries:
  simple1:
    cql: select * from typestest where name = ? and choice = ? LIMIT 100
    fields: samerow             # samerow or multirow (select arguments from the same row, or randomly from all rows in the partition)
  range1:
    cql: select * from typestest where name = ? and choice = ? and date >= ? LIMIT 100
    fields: multirow            # samerow or multirow (select arguments from the same row, or randomly from all rows in the partition)
@xiaoshao

Do you know how much data cassandra-stress writes to the Cassandra database with this profile? I don't know how to calculate it from the columnspec in the profile.