gui/README.md

Tisane GUI

Purpose: Asking disambiguation questions, offering interactions for helping analysts answer those questions, and saving answers to use when inferring a statistical model from a Tisane specification.

The motivating principle behind the interaction model behind Tisane, which includes the GUI and disambiguation questions, is to derive statistical models based on conceptual and data measurement relationships analysts express.

Key challenge

The primary challenge we are trying to address through the Tisane GUI is to (1) suggest potential oversights in initial model specification at the query level based on the variable relationships the analyst expressed and (2) elicit runtime data-specific information (i.e., family and link functions). Both these tasks are necessary in order to generate a final statistical model and ensure that it has statistical conclusion and external validity.

A secondary challenge is to discourage analysts from p-hacking their statistical models. This challenge is already baked into our approach in designing Tisane because analysts author programs that describe variables and variable relationships, not statistical models directly in

In order to address these challenges, we have specific design goals, below.

Design goals

  1. Provide an overiew of the query and interactively added variables to keep "local context" that expresses/represents analyst goals.
  2. Require analysts to consider omissions/alternatives that further their analysis goals.
  3. Guide users through rationale for why Tisane makes specific suggestsions. Imagine users who may not know what a "main effect" or "random effect" is.
  4. This means that we should define terms wherever possible.
  5. This also means that we should indicate implications of decisions. (?)

    FEEDBACK REQUESTED: Do you think we should have any additional design goals/requirements?

Features to add

Feel free to update this list/keep track of them based on the design goals! - If a set of effects (e.g., interaction effects) is not necessary, maybe have an explanation of why that tab isn't necessary/applicable and suggestion for how to update program in order to specify interactions? [Design goal 3]

Data flow overview

  1. Once an analyst runs a Tisane program, the Tisane compiler will generate candidate main, interaction, and random effects and candidate family/link functions.
  2. The Tisane compiler will ouptut a JSON file with all these candidates/options.
  3. The Tisane GUI reads in this output JSON file and should keep track of/populate the app data according to the rules outlined in INFERENCE.md. For example, only interaction effects with two or more main effects (including the variables included in the query) should be shown as options even though the Tisane compiler passes all possible interactions to the GUI.
  4. When the analyst generates code, the Tisane GUI outputs a JSON file that the code generator reads in to generate the Python script.

    We might need to update/tweak this data flow design

Sketches

Here are the sketches we went over August 17

Previous code

From UIST 2021 submission: https://github.com/emjun/tisane/blob/main/tisane/smt/input_interface.py - For generating family functions, see LOC. It calls a helper function generate_data_dist_from_facts, see function definition. This functionality is now found in the Family instances themselves, see family.py here. Each family has a simulate_data function that needs to be updated so that the appropriate parameters are passed. It might also make sense to make these simulate_data functions class functions rather that instance functions.

Development notes

Files

Functions

The main function for generating JSON from Tisane programs is collect_model_candidates in tisane/main.py. The output of this function is a JSON object that can be written out to a local file by calling write_to_json in tisane/main.py. The JSON object/file has the following general structure:

# Example taken from gui/example_inputs/main_interaction_random_intercept_slope_correlated.json
{
    "input": {
        "query": {
            "DV": "Time",
            "IVs": [
                "Word_type"
            ]
        },
        "generatedMainEffects": [
            "Word_type"
        ],
        "generatedInteractionEffects": [],
        "generatedRandomEffects": {
            "Subject": [
                {
                    "random intercept": {
                        "groups": "Subject"
                    }
                },
                {
                    "random slope": {
                        "iv": "Word_type",
                        "groups": "Subject"
                    }
                },
                {
                    "correlated": true
                }
            ],
            "Word": [
                {
                    "random intercept": {
                        "groups": "Word"
                    }
                }
            ]
        },
        "generatedFamilyLinkFunctions": {
            "GaussianFamily": [
                "LogitLink",
                "LogLogLink",
                "ProbitLink",
                "CLogLogLink",
                "PowerLink",
                "IdentityLink",
                "OPowerLink",
                "LogLink",
                "NegativeBinomialLink"
            ],
            "InverseGaussianFamily": [
                "LogLink",
                "IdentityLink",
                "PowerLink"
            ],
            "PoissonFamily": [
                "PowerLink",
                "LogLink",
                "IdentityLink"
            ],
            "TweedieFamily": [
                "PowerLink",
                "LogLink",
                "IdentityLink"
            ],
            "GammaFamily": [
                "IdentityLink",
                "PowerLink",
                "LogLink"
            ]
        },
        "measures to units": {
            "Time": "Subject",
            "Word_type": "Subject"
        }
    }
}

Example Plotly Dash apps

Testing

Styling

Dash Bootstrap documentation: https://dash-bootstrap-components.opensource.faculty.ai/docs/



emjun/tisaner documentation built on Nov. 9, 2024, 6:32 p.m.