Compare commits

No commits in common. "assets" and "master" have entirely different histories.

11 changed files with 9774 additions and 0 deletions

CONTRIBUTING.md Normal file

@@ -0,0 +1,21 @@
# How to Contribute
The easiest way to contribute is via CrowdAnki. Install Anki and add the CrowdAnki add-on to it.
Then you can add this repo via `File -> CrowdAnki: Import from git repository`.
After you've made the modifications you want, export the deck via `File -> Export -> CrowdAnki JSON representation`.
These changes can then be submitted via a normal git pull request.
(If you're not too familiar with git, you can also just send the JSON to my email: abocken@ethz.ch, and I will add the changes.)
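For contributors less familiar with git, the pull-request flow can be sketched roughly as below. This is an illustrative local walk-through only: the repository path and branch name are placeholders, and the final push/pull-request step happens on GitHub against your own fork.

```shell
#!/bin/sh
# Hypothetical contribution flow, run locally for illustration.
# In practice you would `git clone` your fork of this repo instead of `git init`.
tmp="$(mktemp -d)"
cd "$tmp"
git init -q deck && cd deck
git checkout -q -b my-changes           # work on a feature branch
echo '{}' > deck.json                   # stand-in for the CrowdAnki JSON export
git add deck.json
git -c user.email=you@example.com -c user.name=you commit -qm "Update deck"
git log --oneline | head -n 1           # then: git push, and open a PR on GitHub
```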
## Editing fields
You've probably noticed that the fields I'm using aren't plain text but HTML.
If you want to edit a field in a major way, please familiarize yourself with the HTML representation of each field.
I think you'll quickly get the format after looking at one or two examples.
You can view the HTML of a field in the Browse window via:
Select a card -> select the field you want to edit -> click on the burger menu (the one on the right, where all the format options like bold and italics are) -> Edit HTML
`Ctrl+Shift+X` should also open up the HTML of the field.
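For illustration, a field's HTML might look something like the following. This is a hypothetical sketch based on the ozdic-style markup the export script works with (`<P>`, `<U>`, `<TT>` tags), not a copy of an actual card:

```html
<P> <U> VERB + NOUN </U> </P>
<P> carry out | conduct | perform <TT> an experiment </TT> </P>
```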
Shoot me an email if anything is unclear to you; I'll gladly help. (abocken@ethz.ch)

LICENSE Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2020 Alexander Bocken
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md Normal file

@@ -0,0 +1,33 @@
# AVL
<img src="/../assets/total_front.png" alt="Example of front of card" width="49%"> <img src="/../assets/total_back.png" alt="Example of back of card" width="49%">
This is a vocabulary list for Anki derived from the Advanced Vocabulary List (AVL). It covers words no. 201 through no. 520.
This vocab list is for studying these words for my university course [Advanced English for academic purposes C1-C2](https://www.sprachenzentrum.uzh.ch/de/angebot/Kursdetail.html?sprachid=sprache:englisch&targetgpid=zielgruppe:studentETHZ&kursnr=217424a3-e447-4531-9d39-3b04ea63006e).
A heavy focus is set on cognates and collocations for each word family, not just a definition for each word.
## How to use/import
You can easily import this deck via either the available `AVL.apkg` or CrowdAnki (see below for information on the latter).
The apkg files can be found in the ['Releases' section](https://github.com/AlexBocken/AVL/releases) of this GitHub repository.
I'm using Anki as my vocab program with the add-on [CrowdAnki](https://github.com/Stvad/CrowdAnki) to export to a diff-friendly JSON file. You can easily add it to your Anki setup via the code given on [its AnkiWeb page](https://ankiweb.net/shared/info/1788670778).
## Implementation
For time-management reasons, I've decided to simply rip the information from [ozdic.com](http://www.ozdic.com/).
This leaves me with 50 words that couldn't be found on ozdic.
(See TODO.md for the list of words yet to be added correctly.)
These will be added manually from time to time over the coming winter semester 2020.
### Script
The script I've implemented to rip from ozdic seems to work fine in general, but it has issues from time to time.
Expect some weird formatting in places.
These will be fixed over the coming weeks.
## License
MIT

TODO.md Normal file

@@ -0,0 +1,51 @@
| no. | word |
| ---- | ------- |
| 241 | regarding |
| 242 | european |
| 268 | construct |
| 274 | multiple |
| 276 | overall |
| 289 | urban |
| 292 | mental |
| 298 | visual |
| 299 | above |
| 306 | i.e. |
| 308 | conclude |
| 310 | agricultural |
| 311 | moreover |
| 312 | rapidly |
| 314 | approximately |
| 321 | precisely |
| 323 | domestic |
| 326 | actual |
| 341 | largely |
| 347 | enable |
| 352 | substantial |
| 357 | external |
| 361 | subsequent |
| 383 | considerable |
| 388 | respectively |
| 392 | german |
| 396 | constitute |
| 399 | acquire |
| 414 | collective |
| 419 | furthermore |
| 424 | characterize |
| 432 | extensive |
| 433 | biological |
| 436 | widely |
| 438 | merely |
| 446 | explicitly |
| 449 | nevertheless |
| 452 | intellectual |
| 463 | strongly |
| 472 | racial |
| 476 | mutual |
| 486 | assign |
| 487 | asia |
| 502 | hence |
| 504 | give |
| 509 | ethical |
| 514 | residential |
| 517 | simultaneously |
| 518 | possess |

deck.json Normal file

File diff suppressed because one or more lines are too long

script/avl.csv Normal file

@@ -0,0 +1,320 @@
201 unit
202 total
203 complex
204 employ
205 promote
206 literature
207 procedure
208 appropriate
209 estimate
210 negative
211 characteristic
212 typically
213 challenge
214 principle
215 element
216 ethnic
217 depend
218 creation
219 integration
220 aspect
221 publish
222 perspective
223 basic
224 belief
225 technique
226 outcome
227 explore
228 distribution
229 future
230 importance
231 independent
232 initial
233 feature
234 desire
235 following
236 alternative
237 consistent
238 basis
239 contrast
240 obtain
241 regarding
242 european
243 distinction
244 express
245 variety
246 broad
247 component
248 frequently
249 assume
250 additional
251 tool
252 predict
253 internal
254 labor
255 engage
256 separate
257 highly
258 rely
259 assess
260 objective
261 encourage
262 adopt
263 view
264 stability
265 client
266 instrument
267 extend
268 construct
269 demand
270 vision
271 propose
272 efficiency
273 solution
274 multiple
275 conclusion
276 overall
277 presence
278 claim
279 transform
280 generate
281 failure
282 advance
283 connection
284 journal
285 initiative
286 enhance
287 accurate
288 facility
289 urban
290 protection
291 extent
292 mental
293 consequence
294 institute
295 content
296 device
297 scholar
298 visual
299 above
300 unique
301 difficulty
302 discipline
303 sustain
304 capacity
305 perceive
306 i.e.
307 ensure
308 conclude
309 combination
310 agricultural
311 moreover
312 emphasize
313 rapidly
314 approximately
315 acceptance
316 sector
317 commitment
318 experiment
319 implication
320 evaluate
321 precisely
322 notion
323 domestic
324 restriction
325 consist
326 actual
327 formal
328 industrial
329 revolution
330 fundamental
331 essential
332 adapt
333 contact
334 colleague
335 dimension
336 account
337 statistics
338 theme
339 locate
340 adequate
341 largely
342 ideal
343 philosophy
344 minority
345 hypothesis
346 psychological
347 enable
348 trend
349 exchange
350 percentage
351 sufficient
352 substantial
353 explanation
354 emotional
355 preference
356 calculate
357 external
358 code
359 flow
360 transition
361 subsequent
362 phase
363 rural
364 intensity
365 monitor
366 competitive
367 core
368 volume
369 framework
370 incorporate
371 encounter
372 cite
373 attribute
374 emphasis
375 waste
376 climate
377 differ
378 technical
379 mechanism
380 description
381 assert
382 assistance
383 considerable
384 modify
385 isolation
386 territory
387 origin
388 respectively
389 judgement
390 cycle
391 assumption
392 german
393 illustrate
394 justify
395 manner
396 constitute
397 phenomenon
398 relevant
399 acquire
400 correspond
401 planning
402 error
403 household
404 practical
405 professional
406 theoretical
407 summary
408 depression
409 sequence
410 consideration
411 derive
412 arise
413 radical
414 collective
415 recognition
416 proportion
417 mode
418 resistance
419 furthermore
420 diversity
421 anxiety
422 logic
423 whole
424 characterize
425 cooperation
426 dominate
427 implementation
428 universal
429 significance
430 resolution
431 numerous
432 extensive
433 biological
434 display
435 publication
436 widely
437 permit
438 merely
439 joint
440 comprehensive
441 alter
442 insight
443 document
444 imply
445 absence
446 explicitly
447 conventional
448 index
449 nevertheless
450 facilitate
451 evolution
452 intellectual
453 govern
454 signal
455 passage
456 discovery
457 introduction
458 boundary
459 gain
460 yield
461 decline
462 ratio
463 strongly
464 crucial
465 settlement
466 resolve
467 distinguish
468 independence
469 formation
470 transmission
471 shape
472 racial
473 detect
474 poverty
475 intention
476 mutual
477 evolve
478 shift
479 progressive
480 flexibility
481 domain
482 profession
483 apparent
484 coordinate
485 constrain
486 assign
487 asia
488 electronic
489 exception
490 visible
491 norm
492 adjust
493 consumption
494 symbol
495 dominant
496 barrier
497 motor
498 entry
499 underlie
500 bias
501 discriminate
502 hence
503 guide
504 give
505 dialogue
506 manufacture
507 enterprise
508 scope
509 ethical
510 province
511 retain
512 capability
513 revision
514 residential
515 expansion
516 strengthen
517 simultaneously
518 possess
519 manifest
520 incentive

script/cards.csv Normal file

File diff suppressed because one or more lines are too long

script/cards_cleaned.csv Normal file

File diff suppressed because one or more lines are too long

script/gethtml Executable file

@@ -0,0 +1,133 @@
#!/bin/zsh
# This script is just the initial step in porting the AVL data to Anki.
# It does not work 100% reliably, but it's good enough for me;
# manual adjustments to the exported data should be expected.
# This script is rather ugly in its implementation.
[ -f cards.csv ] && rm cards.csv
touch cards.csv
handle_cleaned_html(){
## Handling of multiple meanings
meaning_clean_func="$1"
no_func="$2"
word_func="$3"
pof_func="$4"
if echo "$meaning_clean_func" | grep -q "SUP"; then
count_func="$(echo "$meaning_clean_func" | grep -c "SUP")"
echo "Found $count_func multiple meanings for the word $word_func ($pof_func)"
for j in {1..$count_func}; do
meaning="$( echo "$meaning_clean_func" | grep "<SUP> $j" | perl -pe "s|<P> <SUP> $j </SUP> <TT> (.*?) </TT> </P>|\1|" )"
echo MEANING:
echo "$meaning"
echo DONE
[ -f categoriestmp ] && rm categoriestmp
touch categoriestmp
[ -f backofcardtmp ] && rm backofcardtmp
touch backofcardtmp
printout=false
while IFS= read -r line; do
if [ $printout = true ]; then
if echo "$line" | grep -q "<P> <SUP> $(( j + 1 )) </SUP>" ; then
printout=false
else
echo "$line" >> backofcardtmp
if echo "$line" | grep -q '<P> <U>'; then
echo "$line" | perl -pe "s|<P> <U> (.*?) </U>(.*?)$|<P> <U> \1 </U> </P>|" >> categoriestmp
fi
fi
fi
if echo "$line" | grep -q "SUP"; then
if echo "$line" | grep -q "<P> <SUP> $j </SUP>"; then
printout=true
fi
fi
done <<< $(echo "$meaning_clean_func")
echo BACKSIDE:
cat backofcardtmp
echo CATEGORIES
cat categoriestmp
echo DONE
categories="$(cat categoriestmp | tr '\n' ' ')"
backside="$(cat backofcardtmp | tr '\n' ' ' )"
printf "%s.%s;\"%s\";\"%s\";\"%s\";\"%s\";\"%s\"\n" "$no_func" "$j" "$word_func" "$meaning" "$categories" "$backside" "$pof_func" >> cards.csv
done
else
echo "Found only one meaning for the word $word_func ($pof_func)"
meaning="" #ozdic only provides meaning if there are different ones and the collocations for those different meanings differ.
[ -f categoriestmp ] && rm categoriestmp
touch categoriestmp
while IFS= read -r line; do
#echo current line:
#echo "$line"
echo "$line" | grep "<P> <U> " | perl -pe "s|<P> <U> (.*?) </U>(.*?)$|<P> <U> \1 </U> </P>|" >> categoriestmp
done <<< $(echo "$meaning_clean_func") # iterate over the function argument, not the caller's global
backside="$( echo "$meaning_clean_func" | grep "<P> <U> " | grep -vF '<DIV class="item"><P class="word"><B>'| tr '\n' ' ')"
#cat categoriestmp
categories="$(cat categoriestmp | tr '\n' ' ' )"
printf "%s;\"%s\";\"%s\";\"%s\";\"%s\";\"%s\"\n" "$no_func" "$word_func" "$meaning" "$categories" "$backside" "$pof_func" >> cards.csv
fi
#printf "%s;\"%s\";\"%s\"\n" "$no" "$word" "$meaning_clean" >> cards.csv
}
cat avl.csv | while read line || [[ -n $line ]];
do
no="$(echo "$line" | awk 'BEGIN{FS = "\t" } {print $1}' )"
word="$(echo "$line" | awk 'BEGIN{FS = "\t" } {print $2}' )"
printf "card:%s\t(%s)\n" "$no" "$word"
meaning_clean="$(curl --no-progress-meter http://www.ozdic.com/collocation-dictionary/"$word" | sed '6,35d; 38,48d' | tac | sed '6,30d' | tac | tr '\t' ' ' | tr -d '\r')"
#echo MEANINGCLEAN1:
#echo "$meaning_clean"
#echo DONE
echo "Checking whether there are multiple parts of speech..."
pof_count="$(echo "$meaning_clean" | grep -c "<P class=\"word\">" )"
if [ $pof_count -gt 1 ]; then
echo "$pof_count parts of speech found!"
# separate into multiple "meaning_clean" blocks
pofs="$( echo "$meaning_clean" | grep "<DIV class=\"item\"><P class=\"word\"><B>" | perl -pe "s|^.*?<DIV class=\"item\"><P class=\"word\"><B>[a-zA-Z]*? </B><I>([a-zA-Z\.]*?) </I> </P>|\1|g")"
#echo "pofs:"
#echo "$pofs"
pofs="$(printf "%s\nscript" "$pofs" )"
for i in {1..$pof_count}; do
current_pof=$( echo "$pofs" | sed "${i}q;d") #| sed 's/\./\\./g')
next_pof=$(echo "$pofs" | sed "$((i+1))q;d") #| sed 's/\./\\./g')
echo current_pof: $current_pof
echo next_pof: $next_pof
#start="<DIV class=\"item\"><P class=\"word\"><B>$word <\/B><I>$current_pof <\/I> <\/P>"
start="$( printf '<DIV class="item"><P class="word"><B>%s </B><I>%s </I> </P>\n' "$word" "$current_pof" )"
if [ $i -eq $pof_count ]; then
stop="$next_pof"
#echo MEANINGCLEAN_LASTPOF:
#echo "$meaning_clean"
#echo DONE
else
stop="$( printf '<DIV class="item"><P class="word"><B>%s </B><I>%s </I> </P>\n' "$word" "$next_pof" )"
fi
meaning_clean_temp="$( echo "$meaning_clean" | sed 's,'"$start"',START,')"
meaning_clean_temp="$( echo "$meaning_clean_temp" | sed 's,'"$stop"',STOP,' )"
echo MEANINGCLEAN_TEMP:
echo "$meaning_clean_temp"
echo DONE
pof_meaning_clean="$(echo "$meaning_clean_temp" | sed -n '/START/,/STOP/ { /START/n; /STOP/!p }' )"
echo POFMEANINGCLEAN
echo "$pof_meaning_clean"
echo DONE
notemp="$no"'.'"$i"
handle_cleaned_html "$pof_meaning_clean" "$notemp" "$word" "$current_pof"
done
else
echo "only one part of speech found"
pof="$( echo "$meaning_clean" | grep "<DIV class=\"item\"><P class=\"word\"><B>" | perl -pe "s|^.*?<DIV class=\"item\"><P class=\"word\"><B>[a-zA-Z]*? </B><I>([a-zA-Z\.]*?) </I> </P>|\1|g")"
echo POF:"$pof"
handle_cleaned_html "$meaning_clean" "$no" "$word" "$pof"
fi
done
rm backofcardtmp categoriestmp
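The script's core I/O can be sketched in isolation: `avl.csv` is read as tab-separated `no<TAB>word` pairs (split with awk), and each processed word is written out as a semicolon-delimited, quoted row for `cards.csv`. Below is a minimal, self-contained sketch of just that step, using inline sample data instead of the real files; the empty fields and the `noun` part of speech are placeholders for the scraped content.

```shell
#!/bin/sh
# Read tab-separated "no<TAB>word" pairs, as the script does with awk,
# and emit one semicolon-delimited cards.csv-style row per word.
printf '201\tunit\n202\ttotal\n' | while IFS= read -r line; do
    no="$(echo "$line" | awk 'BEGIN{FS = "\t"} {print $1}')"
    word="$(echo "$line" | awk 'BEGIN{FS = "\t"} {print $2}')"
    # Placeholder fields stand in for meaning, categories, and backside
    printf '%s;"%s";"";"";"";"noun"\n' "$no" "$word"
done
```

This prints one row per input line, e.g. `201;"unit";"";"";"";"noun"`.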

Binary file not shown. (image, 426 KiB)

Binary file not shown. (image, 58 KiB)