---
title: "Issue611"
author: "Douglas Bates"
jupyter: julia-1.7
date: today
date-format: iso
keep-ipynb: true
---
## Load packages and data
using CSV, MixedModels, ProgressMeter, Tables
Recent developments in Julia make it convenient to name the weights vector `wts`, which is the name of the keyword argument in the `LinearMixedModel` constructor: a call can then pass it as `; wts` rather than `; wts=wts`.
dat = CSV.read("./data/model_ready_data_small.csv", columntable)
wts = CSV.read("./data/weights.csv", columntable).weight_vector
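The keyword shorthand mentioned above can be seen in isolation in the following minimal sketch. The function `weighted_mean` and its data are hypothetical stand-ins, not part of MixedModels.jl; the only assumption is Julia 1.5 or later, where the shorthand is available.

```julia
# Hypothetical function with a keyword argument named `wts`,
# standing in for a constructor such as `LinearMixedModel`.
weighted_mean(x; wts=ones(length(x))) = sum(wts .* x) / sum(wts)

x = [1.0, 2.0, 4.0]
wts = [1.0, 1.0, 2.0]    # local variable with the same name as the keyword

long_form = weighted_mean(x; wts=wts)   # explicit name=value pair
short_form = weighted_mean(x; wts)      # shorthand: the name alone is enough
long_form == short_form                 # both calls are equivalent
```

This is why the later calls can end in `; contrasts, wts` without repeating each name.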
Also, assign a `contrasts` specification of `Grouping` to the grouping factors:
contrasts = Dict(
    :brand_category_channel_type => Grouping(),
    :channel_type_sub_brand => Grouping(),
)
With the `Grouping` contrast there is no need to change from an Integer to a Categorical type.
In general I would recommend prepending a character to the number in the CSV file so that the column is read as a String or a `PooledArray`, which is like a Categorical array.
For example,
brand = string.('B', lpad.(dat.brand_category_channel_type, 2, '0'))
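As a small illustration of what that broadcast expression produces, here is a sketch on a made-up vector of integer codes (the values are invented for the example, not taken from the data):

```julia
codes = [1, 12, 3]                            # made-up integer codes
labels = string.('B', lpad.(codes, 2, '0'))   # zero-pad to width 2, then prefix 'B'
sorted_labels = sort(labels)                  # lexical order now matches numeric order
```

Here `labels` is `["B01", "B12", "B03"]`. The zero padding is what keeps `"B03"` ahead of `"B12"` when the labels are sorted as strings.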
These days I do any permanent storage of data frames as Arrow files (see Arrow.jl), because they retain the metadata and you don't need to go through the conversion from CSV to an internal representation followed by transformations.
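A minimal sketch of such a round trip, assuming the Arrow.jl package is installed. The table contents and the file location are made up for the example.

```julia
using Arrow

# Made-up column table standing in for `dat`.
tbl = (brand = ["B01", "B02", "B12"], volume = [10.5, 3.0, 7.25])

path = tempname() * ".arrow"    # hypothetical file location
Arrow.write(path, tbl)          # write once, after all transformations are done
restored = Arrow.Table(path)    # read back without any text parsing

brands_match = collect(restored.brand) == tbl.brand   # strings come back as strings
volume_type = eltype(restored.volume)                 # element type is preserved
```

Because the column types are stored in the file, the prefixed `brand` column does not need to be rebuilt each time the data are loaded.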
## Initial model fits
Before going to vector-valued random effects I would try just random intercepts to see what kinds of fits they produce.
I now write each fit as a `let` block so that I can define the formula without cluttering up the namespace.
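The scoping behavior being relied on can be sketched with plain Julia, using hypothetical names:

```julia
total = let
    parts = [1, 2, 3]   # visible only inside the block
    sum(parts)          # the value of the last expression becomes `total`
end

leaked = @isdefined(parts)   # false: `parts` did not escape the block
```

Only `total` is left behind, in the same way that only the fitted model, and not `form`, is kept after each fit.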
m1 = let
form = @formula(
volume ~ 1 + ppi + base_price +
fit(MixedModel, form, dat; contrasts, wts)
As you can see, it converges on the boundary with a variance component of zero.
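One way to check for a boundary fit programmatically, rather than reading it off the printed summary, is sketched below. It uses the `dyestuff2` example data bundled with MixedModels.jl, not the data from this issue, so it only illustrates the calls.

```julia
using MixedModels

# `dyestuff2` ships with MixedModels.jl; it is not the data used in this document.
ds = MixedModels.dataset(:dyestuff2)
fm = fit(MixedModel, @formula(yield ~ 1 + (1 | batch)), ds)

on_boundary = issingular(fm)   # true when a parameter in θ sits on its lower bound
theta = fm.θ                   # the relative covariance parameter(s)
VarCorr(fm)                    # the estimated variance components
```

The same three calls can be applied to `m1` to confirm which variance component has collapsed to zero.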
m2 = let
    form = @formula(
        volume ~ 1 + ppi + base_price +
        zerocorr(1 + ppi | channel_type_sub_brand)
    )
    fit(MixedModel, form, dat; contrasts, wts)
end
m2a = let
    form = @formula(
        volume ~ 1 + ppi + base_price +
        (0 + ppi | channel_type_sub_brand)
    )
    fit(MixedModel, form, dat; contrasts, wts)
end
m3 = let
    form = @formula(
        volume ~ 1 + ppi + base_price +
        (1 | channel_type_sub_brand) +
        (1 | brand_category_channel_type)
    )
    fit(MixedModel, form, dat; contrasts, wts)
end
m4 = let
    form = @formula(
        volume ~ 1 + ppi + base_price +
        (1 | channel_type_sub_brand) +
        zerocorr(1 + ppi | brand_category_channel_type)
    )
    LinearMixedModel(form, dat; contrasts, wts)
end
@warn "Cholesky factorization failed"
m5 = let
    form = @formula(
        volume ~ 1 + ppi + base_price +
        (0 + ppi | channel_type_sub_brand) +
        (0 + ppi | brand_category_channel_type)
    )
    fit(MixedModel, form, dat; contrasts, wts)
end
So the model seems to come down to a random effect for the slope w.r.t. `ppi` by `channel_type_sub_brand`, model `m2a`, and not much else.
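A comparison like this can be backed up by tabulating the objective and information criteria of the candidate fits. The sketch below does so for three random-effects structures on the `sleepstudy` example data bundled with MixedModels.jl; the data and the names `sleep`, `fits`, and `criteria` are illustrative assumptions, not the models from this issue.

```julia
using MixedModels

# `sleepstudy` ships with MixedModels.jl; it is not the data used in this document.
sleep = MixedModels.dataset(:sleepstudy)

fits = [
    fit(MixedModel, @formula(reaction ~ 1 + days + (1 | subj)), sleep),
    fit(MixedModel, @formula(reaction ~ 1 + days + (0 + days | subj)), sleep),
    fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + days | subj)), sleep),
]

# One row per model: -2 log-likelihood, parameter count, AIC, and BIC.
criteria = [
    (objective = objective(m), dof = dof(m), aic = aic(m), bic = bic(m))
    for m in fits
]
```

Replacing `fits` with `[m2, m2a, m3, m5]` would give the corresponding table for the models fitted here; smaller values of `aic` or `bic` favor a model.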
