In the middle of the desert you can say anything you want
vim.opt.xxx = yy is the most convenient way to set any option, local or global.
vim.wo.relativenumber = true
vim.wo.cursorcolumn = true
vim.wo.cursorline = true
vim.wo.colorcolumn = "80"
-- :set tabstop=8 shiftwidth=4 softtabstop=4 expandtab shiftround
-- vim.g sets global *variables* (g:tabstop), not options, so use vim.opt here
vim.opt.tabstop = 8
vim.opt.softtabstop = 8
vim.opt.shiftwidth = 4
vim.opt.expandtab = true
vim.opt.smarttab = true
Deepeval’s metrics as given in the llamaindex docs:
from deepeval.integrations.llama_index import (
    DeepEvalAnswerRelevancyEvaluator,
    DeepEvalFaithfulnessEvaluator,
    DeepEvalContextualRelevancyEvaluator,
    DeepEvalSummarizationEvaluator,
    DeepEvalBiasEvaluator,
    DeepEvalToxicityEvaluator,
)
if x is not None(): ...
* did you know __init__.py is optional nowadays?
* you can do relative imports with things like "from ..other import foo"
* since 3.13 there is a @deprecated decorator that does what you think it does
* the new generics syntax also works on methods/functions: "def method[T](...)" very cool
* you can type kwargs with typeddicts and unpack: "def fn(**kwargs: Unpack[MyKwargs])"
* dataclasses (and pydantic) support immutable objects with: "class MyModel(BaseModel, frozen=True)" or "@dataclass(frozen=True)"
* class attributes on dataclasses, etc. can be defined with "MY_STATIC: ClassVar[int] = 42" this also supports abstract base classes (ABC)
* TypeVar supports binding to enforce subtypes: "TypeVar('T', bound=X)", and also a default since 3.13: "TypeVar('T', bound=X, default=int)"
* @overload is especially useful for get() methods to express that the return can't be None if the default isn't None
* instead of Union[a, b] or Optional[a] you can write "a | b" or "a | None" nowadays
* with match you can use assert_never() to ensure exhaustive matching in a "case _:" block
* typing has reveal_type() which lets mypy print the type it thinks something is
* typing's "Self" allows you to more properly annotate class method return types
* the time package has functions for monotonic clocks and others not just time()
Ignoring files:
# type: ignore
# flake8: noqa
# pylint: skip-file
"files.autoSave": "onFocusChange",
"[python]": {
"editor.formatOnSave": true,
// "editor.defaultFormatter": "charliermarsh.ruff",
"editor.defaultFormatter": "ms-python.black-formatter",
// reformat everything w/ ruff
"editor.codeActionsOnSave": {
"source.fixAll": "explicit",
"source.organizeImports": "explicit"
},
},
"editor.rulers": [
78, 88
],
"editor.lineNumbers": "relative",
"editor.formatOnPaste": true,
"editor.formatOnSave": true, // non-python stuff like settings.json
<S-j>/<S-k> to switch tabs:
"vim.normalModeKeyBindingsNonRecursive": [
{
"before": [
"J"
],
"after": [],
"commands": [
"workbench.action.nextEditor"
]
},
{
"before": [
"K"
],
"after": [],
"commands": [
"workbench.action.previousEditor"
]
}
],
//"editor.fontFamily": "'Droid Sans Mono', 'monospace', monospace",
"editor.fontFamily": "Fira Code",
"editor.fontLigatures": true,
[2303.16634] G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment [1]
Final score is the probability-weighted sum of the possible ratings: 1*chance-of-1 + 2*chance-of-2 + ...
[2302.04166] GPTScore: Evaluate as You Desire [2]
{Task_Specification} {Aspect_Definition} Text: {Text} Tl;dr: {Summ}
UniEval [2210.07197] Towards a Unified Multi-Dimensional Evaluator for Text Generation [3]
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models [4]
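The G-Eval weighting noted above (1*chance-of-1 + 2*chance-of-2 and so on) can be sketched like this. The function name and the example probabilities are made up, and the renormalisation step is an assumption of this sketch for the case where the probabilities don't sum to exactly 1; it is not taken from the paper.

```python
def expected_rating(rating_probs: dict[int, float]) -> float:
    """Probability-weighted rating: sum of rating * P(rating)."""
    total = sum(rating_probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Assumption: renormalise so that partial probability mass
    # (e.g. only the top few tokens were returned) still gives a
    # value inside the rating scale.
    return sum(rating * p for rating, p in rating_probs.items()) / total


# Made-up probabilities of the model answering "1", "2" or "3".
probs = {1: 0.1, 2: 0.2, 3: 0.7}
print(expected_rating(probs))  # about 2.6
```

This gives a continuous score instead of only the integers the model can literally output.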
Main goal: coherence with human judgement
criterium_name
the output is from 0 to 5
3
but G-Eval works around that
Car has 4 wheels
family with 10 kids 5 dogs living in the Australian bush
ROBUST car with 4 EXTRA LARGE WHEELS made of AUSTRALIAN METAL able to hold 12 KIDS and AT LEAST 8 DOGS
number_of_wheels: 4
), formulate questions based on each, and score better the adverts that contain answers to more questions!
1. Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, Chenguang Zhu. “G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment” (2023). http://arxiv.org/abs/2303.16634, doi:10.48550/arXiv.2303.16634
2. Jinlan Fu, See-Kiong Ng, Zhengbao Jiang, Pengfei Liu. “GPTScore: Evaluate as You Desire” (2023). http://arxiv.org/abs/2302.04166, doi:10.48550/arXiv.2302.04166
3. Ming Zhong, Yang Liu, Da Yin, Yuning Mao, Yizhu Jiao, Pengfei Liu, Chenguang Zhu, Heng Ji, Jiawei Han. “Towards a Unified Multi-Dimensional Evaluator for Text Generation” (2022). http://arxiv.org/abs/2210.07197, doi:10.48550/arXiv.2210.07197
4. Adian Liusie, Potsawee Manakul, Mark J. F. Gales. “LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models” (2024). http://arxiv.org/abs/2307.07889, doi:10.48550/arXiv.2307.07889
5. Alex Wang, Kyunghyun Cho, Mike Lewis. “Asking and Answering Questions to Evaluate the Factual Consistency of Summaries” (2020). http://arxiv.org/abs/2004.04228, doi:10.48550/arXiv.2004.04228
6. Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, Dragomir Radev. “SummEval: Re-evaluating Summarization Evaluation” (2021). https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00373/100686/SummEval-Re-evaluating-Summarization-Evaluation, doi:10.1162/tacl_a_00373
7. Karthik Gopalakrishnan, Behnam Hedayatnia, Qinlang Chen, Anna Gottardi, Sanjeev Kwatra, Anu Venkatesh, Raefer Gabriel, Dilek Hakkani-Tur. “Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations” (2023). http://arxiv.org/abs/2308.11995, doi:10.48550/arXiv.2308.11995
Tried to use subfig for two figures side by side, but couldn’t \autoref it.
I had a caption for the individual subfigs but not for the large figure itself. As soon as I added the caption it worked.
\begin{figure}%
\centering
\subfloat[\centering caption subfig 1]{{\includegraphics[width=0.4\linewidth]{images/fig2.png}}}%
\qquad
\subfloat[\centering caption subfig 2]{{\includegraphics[width=0.4\linewidth]{images/fig4.png} }}%
\caption{Without }
\label{fig:twosamples}%
\end{figure}
\autoref{fig:twosamples}
Pydantic’s FilePath is like Path except that the file has to exist and be a file.
BUT FilePath when validating expects a string as input, not a Path!
(in other words: FilePath(Path) doesn’t seem to work)
So when I create a validator that converts str into Path [1]:
@field_validator("filename", mode="before")
@classmethod
def parse_filename(cls, value: str | Path) -> Path:
    return Path(value)
I get a wonderful
> doc = UCFDocument.model_validate_json(json_string)
E pydantic_core._pydantic_core.ValidationError: 1 validation error for UCFDocument
E filename
E Input is not a valid path for <class 'pathlib.Path'> [type=path_type, input_value=PosixPath('/home/sh/w/cor...n/doc.pdf_data/doc.pdf'), input_type=PosixPath]
tests/ucf/test_data_structures.py:179: ValidationError
Again, the error is a PosixPath not being a Path, though it is one:
E Input is not a valid path for <class 'pathlib.Path'> [type=path_type, input_value=PosixPath('/home/sh/w/cor...n/doc.pdf_data/doc.pdf'), input_type=PosixPath]
# explicitly expecting a PosixPath creates an even better
E Input is not a valid path for <class 'pathlib.PosixPath'> [type=path_type, input_value=PosixPath('/home/sh/w/cor...n/doc.pdf_data/doc.pdf'), input_type=PosixPath]
Not intuitive at all.
Solution is to give FilePath strings and only strings, or drop FilePath to begin with.
├── pydantic v2.10.6
│ ├── annotated-types v0.7.0
│ ├── pydantic-core v2.27.2
│ │ └── typing-extensions v4.12.2
[1] (don’t ask why I needed this, this is a minimal reproducible example only)
CVAT is a really neat labelling platform, online + free on-premise w/ Docker.
(GitHub: cvat-ai/cvat: Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.)
I like it more than Label Studio for images: it has more functions, but is also “heavier” / bulkier.
Love how it supports even 600 MB 4-channel TIFF satellite images and is quite fast at that.
Bits:
<C-a> for snapping polygons to existing polygon points
The Open LLM Leaderboard is dead [1], as good a time as any to look for new eval stuff!
HF universe
Harnesses
Resources / articles: