Tabmemcheck
===========
.. image:: https://img.shields.io/pypi/v/tabmemcheck.svg
   :target: https://img.shields.io/pypi/v/tabmemcheck
Tabmemcheck is an open-source Python library to test language models for memorization of tabular datasets. It provides four different tests for verbatim memorization of a tabular dataset (header test, row completion test, feature completion test, first token test).
The submodule ``tabmemcheck.datasets`` loads tabular datasets in one of four forms (``original``, ``perturbed``, ``task``, ``statistical``). The perturbations are specified in a YAML file for each dataset; examples are contained in ``tabmemcheck.resources.config.transform``.
.. code:: ipython3

    import tabmemcheck
Header Test -- Asks the LLM to complete the initial rows of a csv file
----------------------------------------------------------------------
The example provides evidence of memorization of the UCI Wine dataset in ``gpt-3.5-turbo-0613``.
.. code:: ipython3

    header_prompt, header_completion, response = tabmemcheck.header_test('uci-wine.csv', 'gpt-3.5-turbo-0613', completion_length=350)
.. raw:: html
target,alcohol,malic_acid,[...],proline
1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065
1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050
1,13.16,2.36,2.67,18.6,101,2.8,3.24,.39,2.81,2.29,5.68,1.03,3.17,1185
1,14.37,1.95,2.5,16.8,113,3.85,3.49,.24,2.18,7.8,0.86,3.45,1480
1,13.24,2.59,2.87,21,118,2.8,2.69,.39,1.82,4.32,1.04,2.93,735
1,14.2,1.76,2.45,15.2,112,3.27,3.39,.34,1.97,6.75,1.05,2.85,1450
1,14.39,1.87,2.45,14.6,96,2.5,2.52,.3,1.98,5.25,1.02,3.58,1290
1,14.06,2.15,2.61,17.6,121,2.6,2.51,.31,1.25,5.05,1.06,3.58,1295
1,14.83,1.64,2.17,14,97,2.8,2.98,.29,1.98,5.2,1.08,2.85,1045
1,13.86,1.35,2.27,16,98,2.98,3.15,.22,1.85,7.22,1.01,3.55,1045
1,14.1,2.16,2.3,18,105,2.95,3.32,.22,2.38,5.75,1.25,3.17,1510
1,14.12,1.48,2.32,16.8,95,2.2,2.43,.26,1.57,5,1.17,2.82,1280
1,13.7
Legend: Prompt, Correct, Incorrect, Missing
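The function visualizes the Levenshtein string distance between the actual continuation of the file and the model completion: characters the model reproduced are marked as correct, characters that differ as incorrect, and characters the model left out as missing.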
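For readers unfamiliar with the metric, the following is a minimal sketch of how a Levenshtein distance between the true text and a completion can be computed. The helper ``levenshtein`` and the two example strings are illustrative assumptions; this is not the code that ``tabmemcheck`` uses internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits that turn a into b."""
    # previous[j] is the distance between the first i-1 characters of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning the first i characters of a into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep on a match)
            ))
        previous = current
    return previous[-1]


# Hypothetical strings: the completion differs from the true text in one character.
actual = "1,14.23,1.71,2.43"
completion = "1,14.23,1.71,2.45"
distance = levenshtein(actual, completion)
print(distance)
```

A distance of zero means the completion reproduces the text verbatim; larger values mean more characters had to be inserted, deleted, or substituted.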
Row Completion Test -- Asks the LLM to complete random rows of a csv file
--------------------------------------------------------------------------
The example provides evidence of memorization of the Iris dataset in ``gpt-4-0125-preview``.
.. code:: ipython3

    rows, responses = tabmemcheck.row_completion_test('iris.csv', 'gpt-4-0125-preview', num_queries=25)
.. raw:: html
5,3.5,1.3,0.3,Iris-setosa
5.9,3.2,4.8,1.8,Iris-versicolor
6.9,3.2,5.7,2.3,Iris-virginica
5.7,3.8,1.7,0.3,Iris-setosa
6.7,3.1,5.6,2.4,Iris-virginica
5.5,2.5,4.9,1.3,Iris-versicolor
6.3,2.8,5.1,1.5,Iris-virginica
6.4,3.2,4.5,1.5,Iris-versicolor
7.3,2.9,6.3,1.8,Iris-virginica
6,2.2,5,1.5,Iris-virginica
6.1,2.6,5.6,1.4,Iris-virginica
4.8,3.4,1.9,0.2,Iris-setosa
6.3,2.7,4.9,1.8,Iris-virginica
6.8,3.2,5.9,2.3,Iris-virginica
6.3,3.3,4.7,1.6,Iris-versicolor
5.9,3,4.2,1.5,Iris-versicolor
4.4,3.2,1.3,0.2,Iris-setosa
6.3,2.9,5.6,1.8,Iris-virginica
5.2,4.1,1.5,0.1,Iris-setosa
6.7,3,5,1.7,Iris-versicolor
5.7,4.4,1.5,0.4,Iris-setosa
5,3.5,1.6,0.6,Iris-setosa
7.1,3,5.9,2.1,Iris-virginica
6,2.7,5.1,6,1.6,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
Legend: Prompt, Correct, Incorrect, Missing
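The idea behind a row completion query can be sketched as follows: show the model a few consecutive rows and check whether it reproduces the next row exactly. The helpers ``make_row_completion_query`` and ``exact_match_rate`` and the synthetic rows are hypothetical, written only for illustration; the prompts that the library actually builds may differ.

```python
import random


def make_row_completion_query(rows, num_prefix_rows, rng):
    """Pick consecutive rows as the prompt and the row that follows as the target."""
    # The start index is chosen so that a target row always exists.
    start = rng.randrange(len(rows) - num_prefix_rows)
    prompt = "\n".join(rows[start:start + num_prefix_rows])
    target = rows[start + num_prefix_rows]
    return prompt, target


def exact_match_rate(targets, responses):
    """Fraction of responses equal to their target, ignoring surrounding whitespace."""
    matches = sum(t.strip() == r.strip() for t, r in zip(targets, responses))
    return matches / len(targets)


# Synthetic rows standing in for the lines of a csv file (assumption).
rows = [f"{i},{i * 2},label{i % 3}" for i in range(20)]
rng = random.Random(0)
prompt, target = make_row_completion_query(rows, num_prefix_rows=5, rng=rng)
print(prompt)
print("expected next row:", target)
```

A model that has memorized the file completes many such queries exactly, which is unlikely for a dataset with many distinct real-valued rows unless the rows were seen during training.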
Feature Completion Test -- Asks the LLM to complete the value of a specific feature in a csv file
--------------------------------------------------------------------------------------------------
The example provides evidence of memorization of the Kaggle Titanic dataset in ``gpt-3.5-turbo-0125``.
.. code:: ipython3

    feature_values, responses = tabmemcheck.feature_completion_test('titanic-train.csv', 'gpt-3.5-turbo-0125', feature_name='Name', num_queries=25)
.. raw:: html
Lester, Mr. James
Meanwell, Miss. (Marion Ogden)
Funk, Miss. Annie Clemmer
McGovern, Miss. Mary
Tikkanen, Mr. Juho
Goodwin, Master. Sidney Leonard
Vander Planke, Mr. Leo Edmondus
Vovk, Mr. Janko
Elsbury, Mr. William James
Goodwin, Master. Harold Victor
Abbott, Mrs. Stanton (Rosa Hunt)
Marvin, Mr. Daniel Warner
Ilmakangas, Miss. Pieta Sofia
Cameron, Miss. Clear Annie
Chambers, Mr. Norman Campbell
Culumovic, Mr. Jeso
Fox, Mr. Stanley Hubert
Palsson, Miss. Stina Viola
Brown, Mrs. James Joseph (Margaret Tobin)
Williams, Mr. Charles Duane
Phillips, Miss. Kate Florence ("Mrs Kate Louise Phillips Marshall")
Sage, Miss. Dorothy Edith "Dolly"
Eklund, Mr. Hans Linus
Bowerman, Miss. Elsie Edith
Landergren, Miss. Aurora Adelia
Legend: Prompt, Correct, Incorrect, Missing
First Token Test -- Asks the LLM to complete the value of the first token in the next row of a csv file
--------------------------------------------------------------------------------------------------------
The example provides no evidence of memorization of the Adult Income dataset in ``gpt-3.5-turbo-0125``.
.. code:: ipython3

    tabmemcheck.first_token_test('adult-train.csv', 'gpt-3.5-turbo-0125', num_queries=100)
.. parsed-literal::

    First Token Test: 37/100 exact matches.
    First Token Test Baseline (Matches of most common first token): 50/100.
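The two numbers in this output are meant to be read together: the model's exact matches only suggest memorization if they clearly exceed what one would get by always guessing the most common first token. A minimal sketch of that comparison, using a hypothetical helper ``first_token_scores`` and made-up tokens (not the library's implementation), could look like this:

```python
from collections import Counter


def first_token_scores(true_tokens, predicted_tokens):
    """Return (model exact matches, matches of the most-common-token baseline)."""
    matches = sum(t == p for t, p in zip(true_tokens, predicted_tokens))
    # Baseline: always predict the first token that occurs most often.
    most_common_token, _ = Counter(true_tokens).most_common(1)[0]
    baseline = sum(t == most_common_token for t in true_tokens)
    return matches, baseline


# Made-up first tokens for six rows (assumption, for illustration only).
true_tokens = ["3", "3", "4", "2", "3", "5"]
predicted_tokens = ["3", "4", "4", "3", "2", "1"]
matches, baseline = first_token_scores(true_tokens, predicted_tokens)
print(f"{matches}/{len(true_tokens)} exact matches, baseline {baseline}/{len(true_tokens)}")
```

In the output above, 37 matches fall below the baseline of 50, which is why the example provides no evidence of memorization.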
You can see all prompts that are being sent to the model, and the raw responses
-------------------------------------------------------------------------------
.. code:: ipython3

    tabmemcheck.config.print_prompts = True
.. toctree::
   :maxdepth: 2
   :caption: Contents:

   api_reference