In the middle of the desert you can say anything you want
Scenario:
- uv add etc. to work transparently with a private GitLab PyPI registry
- Token: read_api scope and Developer+ role (read_api, read_repository, read_registry are enough)
- Registry URL variants:
  - https://__token__:glpat-secret-token@gitlab.de/api/v4/projects/1111/packages/pypi/simple (token in the URL)
  - https://gitlab.de/api/v4/projects/1111/packages/pypi/simple
  - https://gitlab.example.com/api/v4/groups/<group_id>/-/packages/pypi/simple (group registry)
  - ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/pypi/simple (inside CI/CD)
- Auth options:
  - UV_INDEX_PRIVATE_REGISTRY_USERNAME/PASSWORD env variables, replacing PRIVATE_REGISTRY with the name you gave the index in pyproject.toml
  - a ~/.netrc file (.netrc - everything curl)

Add the new package registry as index:
[[tool.uv.index]]
name = "my-registry"
url = "https://__token__:glpat-secret-token@gitlab.de/api/v4/projects/1111/packages/pypi/simple"
# authenticate = "always" # see below
# ignore-error-codes = [401]
The URL either contains the token or it doesn’t.
The examples below use /projects/xxx; of course a group registry works as well.
- https://__token__:glpat-secret-token@gitlab.de/api/v4/projects/1111/packages/pypi/simple - token inside the URI
- https://gitlab.de/api/v4/projects/1111/packages/pypi/simple — auth happening through env. variables or ~/.netrc
url = "https://__token__:glpat-secret-token@gitlab.de/api/v4/projects/1111/packages/pypi/simple"
export UV_INDEX_PRIVATE_REGISTRY_USERNAME=__token__
export UV_INDEX_PRIVATE_REGISTRY_PASSWORD=glpat-secret-token
PRIVATE_REGISTRY needs to be replaced with the name of the registry, so e.g. for the pyproject above it’s UV_INDEX_**MY_REGISTRY**_USERNAME.
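A quick sanity check of the mapping (uppercase, dashes become underscores; this is just string munging, not a uv API):

```python
# sketch: derive the env var names uv expects for the index defined above
index_name = "my-registry"  # the `name` from [[tool.uv.index]]
prefix = "UV_INDEX_" + index_name.replace("-", "_").upper()
print(prefix + "_USERNAME")  # UV_INDEX_MY_REGISTRY_USERNAME
print(prefix + "_PASSWORD")  # UV_INDEX_MY_REGISTRY_PASSWORD
```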
From Authentication | uv / HTTP Authentication and PyPI packages in the package registry | GitLab Docs:
Create a ~/.netrc:
machine gitlab.example.com
login __token__
password <personal_token>
uv will use these credentials; when you run uv add ... -v you’ll see lines like
DEBUG Checking netrc for credentials for https://gitlab.de/api/v4/projects/1111/packages/pypi/simple/packagename/
DEBUG Found credentials in netrc file for https://gitlab.de/api/v4/projects/1111/packages/pypi/simple/packagename/
NB git will also use these credentials — so if the token’s scope doesn’t allow e.g. pushing, you won’t be able to git push. Use a wider scope or a personal access token (or env. variables)
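To sanity-check what tools will pick up from that file, Python’s stdlib netrc module parses the same format (just a local check, nothing uv-specific):

```python
# sketch: verify the ~/.netrc entry is parseable and matches the GitLab host
import netrc

auth = netrc.netrc()  # defaults to ~/.netrc
entry = auth.authenticators("gitlab.example.com")
if entry:
    login, _account, password = entry
    print(login)                # __token__
    print(password[:4] + "...") # don't print the whole token
else:
    print("no entry for gitlab.example.com")
```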
When you uv add yourpackage, uv looks for packages in all configured indexes:
- the PyPI one is on by default
- ignore-error-codes = [401] makes uv keep looking in the other registries instead of stopping at the 401 from GitLab

CI/CD pipelines have to have access to the package as well when they run.
GitLab CI/CD job token | GitLab Docs:
You can use a job token to authenticate with GitLab to access another group or project’s resources (the target project). By default, the job token’s group or project must be added to the target project’s allowlist.
In the target project (the one with the private registry, the one that needs to be resolved), go to Settings → CI/CD → Job token permissions and add the source project (the one that will access the packages during CI/CD).
You can also just add the parent group of all the projects; then you don’t have to add any individual ones.
Then $CI_JOB_TOKEN can be used to access the target project’s packages, for example through a ~/.netrc file (note the username!):
machine gitlab.example.com
login gitlab-ci-token
password $CI_JOB_TOKEN
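In the CI job itself you can generate that file before installing; a minimal sketch (assuming gitlab.example.com is your instance’s host):

```python
# sketch: write ~/.netrc inside a CI job so uv can authenticate with the job token
import os
from pathlib import Path

netrc_path = Path.home() / ".netrc"
netrc_path.write_text(
    "machine gitlab.example.com\n"
    "login gitlab-ci-token\n"
    f"password {os.environ['CI_JOB_TOKEN']}\n"
)
netrc_path.chmod(0o600)  # keep the token out of world-readable files
```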
I love firecow/gitlab-ci-local.
When running gitlab-ci-local things, the CI_JOB_TOKEN variable is empty. You can create a .gitlab-ci-local-variables.yaml (don’t forget to gitignore it!) with this variable, it’ll get used automatically and your local CI/CD pipelines will run as well:
CI_JOB_TOKEN: glpat-secret-token
(or just gitlab-ci-local --variable CI_JOB_TOKEN=glpat-secret-token your-command)
Tutorial and bird’s eye view - Jujutsu docs: a Git replacement using git under the hood.
First encountered here: Jujutsu for busy devs | Hacker News / Jujutsu For Busy Devs | maddie, wtf?!
For anyone who’s debating whether or not jj is worth learning, I just want to highlight something. Whenever it comes up on Hacker News, there are generally two camps of people: those who haven’t given it a shot yet and those who evangelize it.
Alright, let’s try!
To use on top of an existing git repository:
jj git init --colocate .
jj git clone --colocate git@github.com:maddiemort/maddie-wtf.git
Command line completions: COMPLETE=fish jj | source
How to Firefox | Hacker News / 🦊 How to Firefox - Kaushik Gopal’s Website
- Type / and start typing for quick find (vs ⌘F). But dig this: type ' and Firefox will only match text for hyperlinks
- URL bar search shortcuts: * for bookmarks, % for open tabs, ^ for history
- If you have an obnoxious site disable right click, just hold Shift and Firefox will bypass it and show it to you. No add-ons required.
(emph mine)
Damn. DAMN.
I need to set this up in qutebrowser as well, it’s brilliant.
For later: Introduction | LLM Inference in Production
autorandr -c vertical-reverse describes my home layout, autorandr -c horizontal describes my work layout. Awesome.
The following virtual configurations are available:
off                  Disable all outputs
common               Clone all connected outputs at the largest common resolution
clone-largest        Clone all connected outputs with the largest resolution (scaled down if necessary)
horizontal           Stack all connected outputs horizontally at their largest resolution
vertical             Stack all connected outputs vertically at their largest resolution
horizontal-reverse   Stack all connected outputs horizontally at their largest resolution in reverse order
vertical-reverse     Stack all connected outputs vertically at their largest resolution in reverse order
Previously:
Link: EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models.
Running an HF model with model args (the HF model name goes into model_args as well):
lm_eval --model hf \
--model_args pretrained=EleutherAI/pythia-160m,revision=step100000,dtype="float" \
--tasks lambada_openai,hellaswag \
--device cuda:0 \
--batch_size 8
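Roughly the same run is available from Python via simple_evaluate() (a sketch; see the interface.md link further down, argument names may vary between versions):

```python
# sketch: the CLI run above, approximately, through the Python API
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m,revision=step100000,dtype=float",
    tasks=["lambada_openai", "hellaswag"],
    device="cuda:0",
    batch_size=8,
)
print(results["results"])  # per-task metric dict
```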
YAML+jinja, can run python code in some of the params.
task: coqa
dataset_path: EleutherAI/coqa
output_type: generate_until
training_split: train
validation_split: validation
doc_to_text: !function utils.doc_to_text
doc_to_target: !function utils.doc_to_target
process_results: !function utils.process_results
should_decontaminate: true
doc_to_decontamination_query: "{{story}} {{question.input_text|join('\n')}}"
generation_kwargs:
  until:
    - "\nQ:"
metric_list:
  - metric: em
    aggregation: mean
    higher_is_better: true
  - metric: f1
    aggregation: mean
    higher_is_better: true
- accelerate launch -m lm_eval --model ...
- --output_path param
- --log_samples logs samples
- --use-cache caches stuff and reruns it only when needed
- --hf_hub_log_args logs the results to HF! (documentation broken though)
- simple_evaluate(): lm-evaluation-harness/docs/interface.md at main · EleutherAI/lm-evaluation-harness

Sample1:
export HF_HOME="~/.cache/huggingface"
export AZURE_OPENAI_API_KEY=""
export AZURE_OPENAI_API_BASE=""
export AZURE_OPENAI_API_VERSION="2023-07-01-preview"
# pip install git+https://github.com/EvolvingLMMs-Lab/lmms-eval.git
python3 -m lmms_eval \
--model openai_compatible \
--model_args model_version=gpt-4o-2024-11-20,azure_openai=True \
--tasks mme,mmmu_val \
--batch_size 1
Task yamls look very similar: lmms-eval/lmms_eval/tasks/gqa/gqa.yaml at main · EvolvingLMMs-Lab/lmms-eval
Evaluator runs the library given by the Connector
(h6_en is lm-eval)
oaieval gpt-3.5-turbo test-match
Data has to be in JSONL format.
{
  "input": [
    {
      "role": "system",
      "content": "You are an assistant with knowledge of U.S. state laws. Answer the questions accurately."
    },
    {
      "role": "user",
      "content": "List the states where adultery is technically illegal. Only provide a list of states with no explanation."
    }
  ],
  "ideal": "Alabama, Arizona, Florida, Idaho, Illinois, Kansas, Michigan, Minnesota, Mississippi, New York, North Carolina, Oklahoma, Rhode Island, South Carolina, Virginia, Wisconsin, Georgia"
}
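A minimal way to produce such a samples.jsonl (one JSON object per line; the content is the example above, truncated):

```python
# sketch: write an evals-style samples.jsonl (one JSON object per line)
import json

samples = [
    {
        "input": [
            {"role": "system", "content": "You are an assistant with knowledge of U.S. state laws. ..."},
            {"role": "user", "content": "List the states where adultery is technically illegal. ..."},
        ],
        "ideal": "Alabama, Arizona, Florida, ...",
    }
]

with open("samples.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```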
Registering the eval2:
<eval_name>:
  id: <eval_name>.dev.v0
  description: <description>
  metrics: [accuracy]
<eval_name>.dev.v0:
  class: evals.elsuite.basic.match:Match
  args:
    samples_jsonl: <eval_name>/samples.jsonl
Sample yaml:3
humor_likert:
  prompt: |-
    Is the following funny?

    {completion}

    Answer using the scale of 1 to 5, where 5 is the funniest.
  choice_strings: "12345"
  choice_scores: from_strings
  input_outputs:
    input: completion

config_dict = yaml.load(yaml_path.read_text())

closedqa:
  prompt: |-
    You are assessing a submitted answer on a given task based on a criterion. Here is the data:
    [BEGIN DATA]
    ***
    [Task]: {input}
    ***
    [Submission]: {completion}
    ***
    [Criterion]: {criteria}
    ***
    [END DATA]
    Does the submission meet the criterion? First, write out in a step by step manner your reasoning about the criterion to be sure that your conclusion is correct. Avoid simply stating the correct answers at the outset. Then print only the single character "Y" or "N" (without quotes or punctuation) on its own line corresponding to the correct answer. At the end, repeat just the letter again by itself on a new line.

    Reasoning:
  eval_type: cot_classify
  choice_scores:
    "Y": 1.0
    "N": 0.0
  choice_strings: 'YN'
  input_outputs:
    input: "completion"
langchain/llm/text-davinci-003:
  class: evals.completion_fns.langchain_llm:LangChainLLMCompletionFn
  args:
    llm: OpenAI
    llm_kwargs:
      model_name: text-davinci-003
langchain/llm/flan-t5-xl:
  class: evals.completion_fns.langchain_llm:LangChainLLMCompletionFn
  args:
    llm: HuggingFaceHub
    llm_kwargs:
      repo_id: google/flan-t5-xl
Not immediately or easily exposed: it definitely supports OpenAI, LangChain and HF, but it’s not intuitive.
lighteval accelerate \
"model_name=openai-community/gpt2" \
"leaderboard|truthfulqa:mc|0|0"
The syntax: {suite}|{task}|{num_few_shot}|{0 for strict num_few_shots, or 1 to allow a truncation if context size is too small}
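As a sanity check, the spec from the command above assembled from those four parts (plain string formatting, nothing lighteval-specific):

```python
# sketch: build a lighteval task spec string from its four parts
def task_spec(suite: str, task: str, num_few_shot: int, allow_truncation: bool = False) -> str:
    return f"{suite}|{task}|{num_few_shot}|{int(allow_truncation)}"

print(task_spec("leaderboard", "truthfulqa:mc", 0))  # leaderboard|truthfulqa:mc|0|0
```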
lighteval/community_tasks/_template.py:

return Doc(
    instruction=ZEROSHOT_QA_INSTRUCTION,
    task_name=task_name,
    query=ZEROSHOT_QA_USER_PROMPT.format(question=line["question"], options=options),
    choices=line["choices"],
    gold_index=gold_index,
)
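A full prompt function wrapping that return might look like this (a sketch: the ZEROSHOT_QA_* strings and the dataset columns question/choices/gold are assumptions, not from the template):

```python
# sketch of a complete prompt_function for the config below; column names are assumed
from lighteval.tasks.requests import Doc

ZEROSHOT_QA_INSTRUCTION = "Answer the following multiple-choice question."
ZEROSHOT_QA_USER_PROMPT = ZEROSHOT_QA_INSTRUCTION + "\n\nQuestion: {question}\nOptions:\n{options}"

def yourbench_prompt(line: dict, task_name: str = "") -> Doc:
    options = "\n".join(f"{i}. {c}" for i, c in enumerate(line["choices"]))
    return Doc(
        instruction=ZEROSHOT_QA_INSTRUCTION,
        task_name=task_name,
        query=ZEROSHOT_QA_USER_PROMPT.format(question=line["question"], options=options),
        choices=line["choices"],
        gold_index=line["choices"].index(line["gold"]),  # assuming a 'gold' column with the answer text
    )
```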
yourbench_mcq = LightevalTaskConfig(
    name="HF_TASK_NAME",  # noqa: F821
    suite=["custom"],
    prompt_function=yourbench_prompt,
    hf_repo="HF_DATASET_NAME",  # noqa: F821
    hf_subset="lighteval",
    hf_avail_splits=["train"],
    evaluation_splits=["train"],
    few_shots_split=None,
    few_shots_select=None,
    generation_size=8192,
    metric=[Metrics.yourbench_metrics],
    trust_dataset=True,
    version=0,
)
Many, and model configs are yamls: lighteval/examples/model_configs at main · huggingface/lighteval. For example: lighteval/examples/model_configs/litellm_model.yaml at main · huggingface/lighteval
model_parameters:
  model_name: "openai/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
  provider: "openai"
  base_url: "https://router.huggingface.co/hf-inference/v1"
  generation_parameters:
    temperature: 0.5
    max_new_tokens: 256
    top_p: 0.9
    seed: 0
    repetition_penalty: 1.0
    frequency_penalty: 0.0
Their default prompts: lighteval/src/lighteval/tasks/default_prompts.py at main · huggingface/lighteval
# Run benchmark
helm-run --run-entries mmlu:subject=philosophy,model=openai/gpt2 --suite my-suite --max-eval-instances 10
# Summarize benchmark results
helm-summarize --suite my-suite
A model has metadata (a description) and a deployment (how to actually run it / the implementation); both are yamls: Adding New Models - CRFM HELM.
HF model deployment (but running locally!):
- name: huggingface/gemma-2-9b-it
  model_name: google/gemma-2-9b-it
  tokenizer_name: google/gemma-2-9b
  max_sequence_length: 8192
  client_spec:
    class_name: "helm.clients.huggingface_client.HuggingFaceClient"
    args:
      device_map: auto
      torch_dtype: torch.bfloat16
I’m not certain how that connects to their Hugging Face Model Hub Integration - CRFM HELM; tl;dr only AutoModelForCausalLM. To run:
helm-run \
  --run-entries boolq:model=stanford-crfm/BioMedLM \
  --enable-huggingface-models stanford-crfm/BioMedLM \
  --suite v1 \
  --max-eval-instances 10
All (many!) deployments: helm/src/helm/config/model_deployments.yaml at main · stanford-crfm/helm
The vLLM example uses an OpenAI-compatible inference server.
Never heard of it but looks cool! And supports many types of evals.
Subjective Evaluation Guidance — OpenCompass 0.4.2 documentation
All configs are in Python: models, tasks, etc.
# model_cfg.py
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        path='huggyllama/llama-7b',
        model_kwargs=dict(device_map='auto'),
        tokenizer_path='huggyllama/llama-7b',
        tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
        max_seq_len=2048,
        max_out_len=50,
        run_cfg=dict(num_gpus=8, num_procs=1),
    )
]
OpenAI:
from opencompass.models import OpenAI

models = [
    dict(
        type=OpenAI,               # Using the OpenAI model
        # Parameters for `OpenAI` initialization
        path='gpt-4',              # Specify the model type
        key='YOUR_OPENAI_KEY',     # OpenAI API Key
        max_seq_len=2048,          # The max input number of tokens
        # Common parameters shared by various models, not specific to `OpenAI` initialization.
        abbr='GPT-4',              # Model abbreviation used for result display.
        max_out_len=512,           # Maximum number of generated tokens.
        batch_size=1,              # The size of a batch during inference.
        run_cfg=dict(num_gpus=0),  # Resource requirements (no GPU needed)
    ),
]
Same creators as the above one, multimodal eval.
From their README:
import pytest
from deepeval import assert_test
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCase, LLMTestCaseParams

def test_case():
    correctness_metric = GEval(
        name="Correctness",
        criteria="Determine if the 'actual output' is correct based on the 'expected output'.",
        evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT, LLMTestCaseParams.EXPECTED_OUTPUT],
        threshold=0.5
    )
    test_case = LLMTestCase(
        input="What if these shoes don't fit?",
        # Replace this with the actual output from your LLM application
        actual_output="You have 30 days to get a full refund at no extra cost.",
        expected_output="We offer a 30-day full refund at no extra costs.",
        retrieval_context=["All customers are eligible for a 30 day full refund at no extra costs."]
    )
    assert_test(test_case, [correctness_metric])
deepeval set-local-model --model-name=<model_name> \
--base-url="http://localhost:8000/v1/" \
--api-key=<api-key>
@observe decorators, “avoiding rewriting your app just for testing”

1. https://github.com/EvolvingLMMs-Lab/lmms-eval/blob/main/examples/models/openai_compatible.sh ↩︎
2. https://github.com/openai/evals/blob/main/evals/registry/modelgraded/closedqa.yaml ↩︎
3. [evals/evals/registry/modelgraded/humor.yaml at main · openai/evals](https://github.com/openai/evals/blob/main/evals/registry/modelgraded/humor.yaml); https://github.com/openai/evals/blob/main/evals/registry/modelgraded/closedqa.yaml ↩︎
Just some really quick notes on this, it’s pointless and redundant but I’ll need these later
Create a .yaml with a GitHub Models model:
model_list:
  # - model_name: github-Llama-3.2-11B-Vision-Instruct # Model alias to use for requests
  - model_name: minist # Model alias to use for requests
    litellm_params:
      model: github/Ministral-3B
      api_key: "os.environ/GITHUB_API_KEY" # ensure you have `GITHUB_API_KEY` in your .env
After setting GITHUB_API_KEY, litellm --config config.yaml
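Then anything OpenAI-compatible can use the alias through the proxy; a sketch with the openai client (port 4000 is litellm’s default, the key is ignored unless you set a master key):

```python
# sketch: call the litellm proxy started above via the OpenAI client, using the "minist" alias
from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:4000/v1", api_key="whatever")
resp = client.chat.completions.create(
    model="minist",
    messages=[{"role": "user", "content": "Say hi in five words."}],
)
print(resp.choices[0].message.content)
```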
python3 -m fastchat.serve.controller

model_config.json:

{
  "minist": {
    "model_name": "minist",
    "api_base": "http://0.0.0.0:4000/v1",
    "api_type": "openai",
    "api_key": "whatever",
    "anony_only": false
  }
}
python3 -m fastchat.serve.gradio_web_server_multi --register-api-endpoint-file ../model_config.json

TIL about 1C when I had to move it from one Windows laptop to another. First and last Windows post here, hopefully.
Ref: 1С – как перенести базу на другой компьютер (1C: how to move a database to another computer)
Long story short:
echo -e '\e[1mbold\e[22m'
echo -e '\e[2mdim\e[22m'
echo -e '\e[3mitalic\e[23m'
echo -e '\e[4munderline\e[24m'
echo -e '\e[4:1mthis is also underline (since 0.52)\e[4:0m'
echo -e '\e[21mdouble underline (since 0.52)\e[24m'
echo -e '\e[4:2mthis is also double underline (since 0.52)\e[4:0m'
echo -e '\e[4:3mcurly underline (since 0.52)\e[4:0m'
echo -e '\e[4:4mdotted underline (since 0.76)\e[4:0m'
echo -e '\e[4:5mdashed underline (since 0.76)\e[4:0m'
echo -e '\e[5mblink (since 0.52)\e[25m'
echo -e '\e[7mreverse\e[27m'
echo -e '\e[8minvisible\e[28m <- invisible (but copy-pasteable)'
echo -e '\e[9mstrikethrough\e[29m'
echo -e '\e[53moverline (since 0.52)\e[55m'
echo -e '\e[31mred\e[39m'
echo -e '\e[91mbright red\e[39m'
echo -e '\e[38:5:42m256-color, de jure standard (ITU-T T.416)\e[39m'
echo -e '\e[38;5;42m256-color, de facto standard (commonly used)\e[39m'
echo -e '\e[38:2::240:143:104mtruecolor, de jure standard (ITU-T T.416) (since 0.52)\e[39m'
echo -e '\e[38:2:240:143:104mtruecolor, rarely used incorrect format (might be removed at some point)\e[39m'
echo -e '\e[38;2;240;143;104mtruecolor, de facto standard (commonly used)\e[39m'
echo -e '\e[46mcyan background\e[49m'
echo -e '\e[106mbright cyan background\e[49m'
echo -e '\e[48:5:42m256-color background, de jure standard (ITU-T T.416)\e[49m'
echo -e '\e[48;5;42m256-color background, de facto standard (commonly used)\e[49m'
echo -e '\e[48:2::240:143:104mtruecolor background, de jure standard (ITU-T T.416) (since 0.52)\e[49m'
echo -e '\e[48:2:240:143:104mtruecolor background, rarely used incorrect format (might be removed at some point)\e[49m'
echo -e '\e[48;2;240;143;104mtruecolor background, de facto standard (commonly used)\e[49m'
echo -e '\e[21m\e[58:5:42m256-color underline (since 0.52)\e[59m\e[24m'
echo -e '\e[21m\e[58;5;42m256-color underline (since 0.52)\e[59m\e[24m'
echo -e '\e[4:3m\e[58:2::240:143:104mtruecolor underline (since 0.52) (*)\e[59m\e[4:0m'
echo -e '\e[4:3m\e[58:2:240:143:104mtruecolor underline (since 0.52) (might be removed at some point) (*)\e[59m\e[4:0m'
echo -e '\e[4:3m\e[58;2;240;143;104mtruecolor underline (since 0.52) (*)\e[59m\e[4:0m'