Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

content & contrib: minor updates #33

Merged
merged 2 commits into from
Sep 7, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 15 additions & 1 deletion .vscode/jupyterbook.code-snippets
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
{
"BibTeX URL": {
"BibTeX URL (misc)": {
"scope": "bibtex",
"prefix": "@online",
"body": [
Expand All @@ -12,6 +12,20 @@
],
"description": "Add a website citation"
},
"BibTeX URL (news)": {
"scope": "bibtex",
"prefix": "@article",
"body": [
"@article{${1:key},",
"title={$3},",
"author={$4},",
"year=${5:lastUpdated},",
"journal={$6},"
"url={$2}",
"}"
],
"description": "Add a news website citation"
},
"Figure (external)": {
"scope": "markdown",
"prefix": "fig-ext",
Expand Down
16 changes: 13 additions & 3 deletions licences.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,15 @@
% TODO: https://tldr.cdcl.ml/tags/#law
% TODO: summary graphic?

Concerning {term}`IP` in software-related fields, developers are likely aware of two "[open](open)" copyright licence categories: one for highly structured work (e.g. software), and the other for general content (e.g. prosaic text and images). These two categories needed to exist separately to solve problems unique to their domains, and thus were not designed to be compatible. A particular piece of work is expected to fall into just one category, not both.
Concerning {term}`IP` in software-related fields, developers are likely aware of two "[open](open)" copyright licence categories: one for highly structured work (e.g. software), and the other for general content (e.g. [data](data) including prosaic text and images). These two categories needed to exist separately to solve problems unique to their domains, and thus were not designed to be compatible. A particular piece of work is expected to fall into just one category, not both.

Copyright for ML models, however, is more nuanced.
Copyright for [ML models](ml-models), however, is more nuanced.

Aside from categorisation, a further complication is the lack of [legal precedence](legal-precedence). A licence is not necessarily automatically legally binding -- it may be [incompatible with existing laws](copyright-exceptions). Furthermore, in an increasingly global workplace, it may be unclear which country's laws should be applicable in a particular case.
Aside from categorisation, a further complication is the lack of [legal precedence](legal-precedence). A licence is not necessarily automatically legally binding -- it may be [incompatible with existing laws](copyright-exceptions). Furthermore, in an increasingly global workplace, it may be unclear [which country's laws](national-laws) should be applicable in a particular case.

Finally, licence terms disclaiming warranty/liability are contributing to an [accountability crisis](accountability-crisis).

(ml-models)=

## ML Models

Expand Down Expand Up @@ -54,6 +58,8 @@ Some interesting observations currently:

Licences are increasingly being recognised as important, and are even mentioned in some online leaderboards such as [LMSys ChatBot Arena](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard).

(data)=

## Data

As briefly alluded to, data and code are often each covered by their own licence categories -- but there may be conflicts when these two overlap. For example, pre-trained weights are a product of both code and data. This means one licence intended for non-code work (i.e. data) and another licence intended for code (i.e. model architectures) must simultaneously apply to the weights. This may be problematic or even nonsensical.
Expand Down Expand Up @@ -96,6 +102,8 @@ Subcategory | Conditions | Licence examples

One big problem is enforcing licence conditions (especially of {term}`copyleft` or even more restrictive licences), particularly in an open-source-centric climate with potentially billions of infringing users. It is a necessary condition of a law that it should be enforceable {cite}`law-enforceability`, which is infeasible with most current software {cite}`linux-warranty,cdcl-policing-foss,cdcl-os-illegal`.

(national-laws)=

## National vs International Laws

(copyright-exceptions)=
Expand All @@ -122,6 +130,8 @@ Only rare cases involving lots of money or large organisations go to court {cite
- Jun 2023 privacy case {cite}`openai-privacy-case` against Microsoft & OpenAI
- Nov 2022 copyright and open source licences case {cite}`legalpdf-doe-github-case` against GitHub

(accountability-crisis)=

## Accountability Crisis

Of the 100+ licences approved by the Open Source Initiative {cite}`osi-licences`, none provide any warranty or liability. In fact, all expressly **disclaim** warranty/liability (apart from [`MS-PL`](https://learn.microsoft.com/en-us/previous-versions/msp-n-p/ff647676(v=pandp.10)?redirectedfrom=MSDN) and [`MS-RL`](https://opensource.org/license/ms-rl-html), which don't expressly mention liability).
Expand Down