<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<LINK REL="stylesheet" TYPE="text/css" HREF="doc.css">
<META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.82">
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<TITLE>ld65 Users Guide</TITLE>
</HEAD>
<BODY>
<H1>ld65 Users Guide</H1>
<H2>
<A HREF="mailto:[email protected]">Ullrich von Bassewitz</A></H2>
<HR>
<EM>The ld65 linker combines object files into an executable file. ld65 is highly
configurable and uses configuration files for high flexibility.</EM>
<HR>
<P>
<H2><A NAME="toc1">1.</A> <A HREF="ld65.html#s1">Overview</A></H2>
<P>
<H2><A NAME="toc2">2.</A> <A HREF="ld65.html#s2">Usage</A></H2>
<UL>
<LI><A NAME="toc2.1">2.1</A> <A HREF="ld65.html#ss2.1">Command-line option overview</A>
<LI><A NAME="toc2.2">2.2</A> <A HREF="ld65.html#ss2.2">Command-line options in detail</A>
</UL>
<P>
<H2><A NAME="toc3">3.</A> <A HREF="ld65.html#s3">Search paths</A></H2>
<UL>
<LI><A NAME="toc3.1">3.1</A> <A HREF="ld65.html#ss3.1">Library search path</A>
<LI><A NAME="toc3.2">3.2</A> <A HREF="ld65.html#ss3.2">Object file search path</A>
<LI><A NAME="toc3.3">3.3</A> <A HREF="ld65.html#ss3.3">Config file search path</A>
</UL>
<P>
<H2><A NAME="toc4">4.</A> <A HREF="ld65.html#s4">Detailed workings</A></H2>
<P>
<H2><A NAME="toc5">5.</A> <A HREF="ld65.html#s5">Configuration files</A></H2>
<UL>
<LI><A NAME="toc5.1">5.1</A> <A HREF="ld65.html#ss5.1">Memory areas</A>
<LI><A NAME="toc5.2">5.2</A> <A HREF="ld65.html#ss5.2">Segments</A>
<LI><A NAME="toc5.3">5.3</A> <A HREF="ld65.html#ss5.3">Output files</A>
<LI><A NAME="toc5.4">5.4</A> <A HREF="ld65.html#ss5.4">OVERWRITE segments</A>
<LI><A NAME="toc5.5">5.5</A> <A HREF="ld65.html#ss5.5">LOAD and RUN addresses (ROMable code)</A>
<LI><A NAME="toc5.6">5.6</A> <A HREF="ld65.html#ss5.6">Other MEMORY area attributes</A>
<LI><A NAME="toc5.7">5.7</A> <A HREF="ld65.html#ss5.7">Other SEGMENT attributes</A>
<LI><A NAME="toc5.8">5.8</A> <A HREF="ld65.html#ss5.8">The FILES section</A>
<LI><A NAME="toc5.9">5.9</A> <A HREF="ld65.html#ss5.9">The FORMAT section</A>
<LI><A NAME="toc5.10">5.10</A> <A HREF="ld65.html#ss5.10">The FEATURES section</A>
<LI><A NAME="toc5.11">5.11</A> <A HREF="ld65.html#ss5.11">The SYMBOLS section</A>
</UL>
<P>
<H2><A NAME="toc6">6.</A> <A HREF="ld65.html#s6">Special segments</A></H2>
<UL>
<LI><A NAME="toc6.1">6.1</A> <A HREF="ld65.html#ss6.1">INIT</A>
<LI><A NAME="toc6.2">6.2</A> <A HREF="ld65.html#ss6.2">LOWCODE</A>
<LI><A NAME="toc6.3">6.3</A> <A HREF="ld65.html#ss6.3">ONCE</A>
<LI><A NAME="toc6.4">6.4</A> <A HREF="ld65.html#ss6.4">STARTUP</A>
<LI><A NAME="toc6.5">6.5</A> <A HREF="ld65.html#ss6.5">ZPSAVE</A>
</UL>
<P>
<H2><A NAME="toc7">7.</A> <A HREF="ld65.html#s7">Copyright</A></H2>
<HR>
<H2><A NAME="s1">1.</A> <A HREF="#toc1">Overview</A></H2>
<P>The ld65 linker combines several object modules created by the ca65
assembler, producing an executable file. The object modules may be read
from a library created by the ar65 archiver (this is somewhat faster and
more convenient). The linker was designed to be as flexible as possible.
It complements the features that are built into the ca65 macroassembler:</P>
<P>
<UL>
<LI> Accept any number of segments to form an executable module.
</LI>
<LI> Resolve arbitrary expressions stored in the object files.
</LI>
<LI> In case of errors, use the meta information stored in the object files
to produce helpful error messages. In case of undefined symbols,
expression range errors, or symbol type mismatches, ld65 is able to
tell you the exact location in the original assembler source, where
the symbol was referenced.
</LI>
<LI> Flexible output. The output of ld65 is highly configurable by a config
file. Some more-common platforms are supported by default configurations
that may be activated by naming the target system. The output
generation was designed with different output formats in mind, so
adding other formats shouldn't be a great problem.
</LI>
</UL>
</P>
<H2><A NAME="s2">2.</A> <A HREF="#toc2">Usage</A></H2>
<H2><A NAME="ss2.1">2.1</A> <A HREF="#toc2.1">Command-line option overview</A>
</H2>
<P>The linker is called as follows:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
---------------------------------------------------------------------------
Usage: ld65 [options] module ...
Short options:
-( Start a library group
-) End a library group
-C name Use linker config file
-D sym=val Define a symbol
-L path Specify a library search path
-Ln name Create a VICE label file
-S addr Set the default start address
-V Print the linker version
-h Help (this text)
-m name Create a map file
-o name Name the default output file
-t sys Set the target system
-u sym Force an import of symbol 'sym'
-v Verbose mode
-vm Verbose map file
Long options:
--allow-multiple-definition Allow multiple definitions
--cfg-path path Specify a config file search path
--config name Use linker config file
--dbgfile name Generate debug information
--define sym=val Define a symbol
--end-group End a library group
--force-import sym Force an import of symbol 'sym'
--help Help (this text)
--large-alignment Don't warn about large alignments
--lib file Link this library
--lib-path path Specify a library search path
--mapfile name Create a map file
--module-id id Specify a module id
--obj file Link this object file
--obj-path path Specify an object file search path
--start-addr addr Set the default start address
--start-group Start a library group
--target sys Set the target system
--version Print the linker version
--warnings-as-errors Treat warnings as errors
---------------------------------------------------------------------------
</PRE>
</CODE></BLOCKQUOTE>
</P>
<H2><A NAME="ss2.2">2.2</A> <A HREF="#toc2.2">Command-line options in detail</A>
</H2>
<P>Here is a description of all of the command-line options:</P>
<P>
<DL>
<DT><B><CODE>--allow-multiple-definition</CODE></B><DD>
<P>Normally when a global symbol is defined multiple times, ld65 will
issue an error and not create the output file. This option lets it
silently ignore this fact and continue. The first definition of a
symbol will be used.</P>
<P>
<A NAME="option--start-group"></A> </P>
<DT><B><CODE>-(, --start-group</CODE></B><DD>
<P>Start a library group. The libraries specified within a group are searched
multiple times to resolve cross-references among the libraries in the group. Normally,
cross-references are resolved only within a library; that is, each library is
searched multiple times on its own. Libraries specified later on the command line
cannot reference otherwise unreferenced symbols in libraries specified
earlier, because the linker has already handled them. Library groups are
a solution for this problem, because the linker will search repeatedly
through all libraries specified in the group, until all possible open
symbol references have been satisfied.</P>
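<P>As an illustrative sketch (the file names <CODE>mysystem.cfg</CODE>, <CODE>main.o</CODE>,
<CODE>gfx.lib</CODE> and <CODE>snd.lib</CODE> are made up for this example), two libraries that
reference each other could be linked as a group like this:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 -C mysystem.cfg -o prog main.o --start-group gfx.lib snd.lib --end-group
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>The long option names are used here because the short forms <CODE>-(</CODE> and
<CODE>-)</CODE> usually have to be quoted or escaped in a shell.</P>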
<DT><B><CODE>-), --end-group</CODE></B><DD>
<P>End a library group. See the explanation of the <CODE>
<A HREF="#option--start-group">--start-group</A></CODE> option.</P>
<DT><B><CODE>-h, --help</CODE></B><DD>
<P>Print the short option summary shown above.</P>
<P>
<A NAME="option-m"></A> </P>
<DT><B><CODE>-m name, --mapfile name</CODE></B><DD>
<P>This option (which needs an argument that will be used as a filename for
the generated map file) will cause the linker to generate a map file.
The map file contains a detailed overview of the modules used, the
sizes of the different segments, and a table containing exported
symbols.</P>
<P>
<A NAME="option-o"></A> </P>
<DT><B><CODE>-o name</CODE></B><DD>
<P>The -o switch is used to give the name of the default output file.
Depending on your output configuration, this name <EM>might not</EM> be used as the
name for the output file. However, for the default configurations, this
name is used for the output file name.</P>
<P>
<A NAME="option-t"></A> </P>
<DT><B><CODE>-t sys, --target sys</CODE></B><DD>
<P>The argument for the -t switch is the name of the target system. Since this
switch will activate a default configuration, it may not be used together
with the <CODE>
<A HREF="#option-C">-C</A></CODE> option. The following target
systems are currently supported:</P>
<P>
<UL>
<LI>none</LI>
<LI>module</LI>
<LI>apple2</LI>
<LI>apple2enh</LI>
<LI>atari2600</LI>
<LI>atari7800</LI>
<LI>atari</LI>
<LI>atarixl</LI>
<LI>atmos</LI>
<LI>c16 (works also for the c116 with memory up to 32K)</LI>
<LI>c64</LI>
<LI>c128</LI>
<LI>cbm510 (CBM-II series with 40-column video)</LI>
<LI>cbm610 (all CBM series-II computers with 80-column video)</LI>
<LI>geos-apple</LI>
<LI>geos-cbm</LI>
<LI>lunix</LI>
<LI>lynx</LI>
<LI>nes</LI>
<LI>pet (all CBM PET systems except the 2001)</LI>
<LI>plus4</LI>
<LI>sim6502</LI>
<LI>sim65c02</LI>
<LI>supervision</LI>
<LI>telestrat</LI>
<LI>vic20</LI>
</UL>
</P>
<P>There are a few more targets defined, but none of them is actually
supported.</P>
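<P>For example, assuming an object file <CODE>hello.o</CODE> assembled for the C64 and a
runtime library named <CODE>c64.lib</CODE> (both names are assumptions for this sketch),
the default C64 configuration could be selected like this:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 -t c64 -o hello hello.o c64.lib
</PRE>
</CODE></BLOCKQUOTE>
</P>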
<DT><B><CODE>-u sym[:addrsize], --force-import sym[:addrsize]</CODE></B><DD>
<P>Force an import of a symbol. While object files are always linked to the
output file, regardless of whether there are any references, object modules from
libraries are linked in only if an import can be satisfied by the module.
The <CODE>--force-import</CODE> option may be used to add a reference to a symbol and
as a result force linkage of the module that exports the identifier.</P>
<P>The name of the symbol may optionally be followed by a colon and an address-size
specifier. If no address size is specified, the default address size
for the target machine is used.</P>
<P>Please note that the symbol name needs to have the internal representation,
meaning you have to prepend an underscore for C identifiers.</P>
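<P>A minimal sketch, assuming a C function <CODE>irq_handler</CODE> that lives in a
hypothetical library <CODE>mylib.lib</CODE> and is not referenced by any other module
(all file and symbol names here are invented for illustration):</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 -C mysystem.cfg -o prog -u _irq_handler main.o mylib.lib
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Note the leading underscore on the C identifier. The option is placed before the
library in this sketch, on the assumption that the forced reference should already
exist when the library is searched.</P>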
<P>
<A NAME="option-v"></A> </P>
<DT><B><CODE>-v, --verbose</CODE></B><DD>
<P>Using the -v option, you may enable more output that may help you to
locate problems. If an undefined symbol is encountered, -v causes the
linker to print a detailed list of the references (that is, source file
and line) for this symbol.</P>
<DT><B><CODE>-vm</CODE></B><DD>
<P>Must be used in conjunction with <CODE>
<A HREF="#option-m">-m</A></CODE>
(generate map file). Normally the map file will not include empty segments
and sections, or unreferenced symbols. Using this option, you can force the
linker to include all that information into the map file. Also, it will
include a second <CODE>Exports</CODE> list. The first list is sorted by name;
the second one is sorted by value.</P>
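<P>As a sketch (all file names are placeholders), a verbose map file could be
requested like this:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 -C mysystem.cfg -o prog -m prog.map -vm main.o mylib.lib
</PRE>
</CODE></BLOCKQUOTE>
</P>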
<P>
<A NAME="option-C"></A> </P>
<DT><B><CODE>-C name, --config name</CODE></B><DD>
<P>This gives the name of an output config file to use. See section 5 for more
information about config files. -C may not be used together with <CODE>
<A HREF="#option-t">-t</A></CODE>.</P>
<P>
<A NAME="option-D"></A> </P>
<DT><B><CODE>-D sym=value, --define sym=value</CODE></B><DD>
<P>This option allows you to define an external symbol on the command line. The value
may start with a '$' sign or with <CODE>0x</CODE> for hexadecimal values;
otherwise, a leading zero denotes octal values. See also
<A HREF="#SYMBOLS">the SYMBOLS section</A> in the configuration file.</P>
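<P>The two command lines below are a sketch of both hexadecimal notations; the
symbol name <CODE>__LOADADDR__</CODE> and the file names are hypothetical. In most
shells, the '$' form has to be quoted so that the shell does not try to expand it
as a variable:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 -C mysystem.cfg -o prog -D __LOADADDR__=0x0801 main.o
ld65 -C mysystem.cfg -o prog -D '__LOADADDR__=$0801' main.o
</PRE>
</CODE></BLOCKQUOTE>
</P>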
<P>
<A NAME="option--lib-path"></A> </P>
<DT><B><CODE>-L path, --lib-path path</CODE></B><DD>
<P>Specify a library search path. This option may be used more than once. It
adds a directory to the search path for library files. Libraries specified
without a path are searched in the current directory, in the list of
directories specified using <CODE>--lib-path</CODE>, in directories given by
environment variables, and in a built-in default directory.</P>
<DT><B><CODE>-Ln</CODE></B><DD>
<P>This option allows you to create a file that contains all global labels and
may be loaded into the VICE emulator using the <CODE>ll</CODE> (load label) command
or into the Oricutron emulator using the <CODE>sl</CODE> (symbols load) command. You
may use this to debug your code with VICE. Note: Older versions had some
bugs in the label code. If you have problems, please get the latest
<A HREF="http://vice-emu.sourceforge.net">VICE</A> version.</P>
<P>
<A NAME="option-S"></A> </P>
<DT><B><CODE>-S addr, --start-addr addr</CODE></B><DD>
<P>Using -S you may define the default starting address. If and how this
address is used depends on the config file in use. For the default
configurations, only the "none", "apple2" and "apple2enh" systems honor an
explicit start address; all other default configs provide their own.</P>
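<P>A sketch for the "none" target, assuming an object file <CODE>main.o</CODE> that
should be linked to start at $1000 (the file names are placeholders, and the
<CODE>0x</CODE> notation is assumed to be accepted here as it is for <CODE>-D</CODE>):</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 -t none -S 0x1000 -o prog main.o
</PRE>
</CODE></BLOCKQUOTE>
</P>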
<DT><B><CODE>-V, --version</CODE></B><DD>
<P>This option prints the version number of the linker. If you send any
suggestions or bugfixes, please include this number.</P>
<P>
<A NAME="option--cfg-path"></A> </P>
<DT><B><CODE>--cfg-path path</CODE></B><DD>
<P>Specify a config file search path. This option may be used more than once.
It adds a directory to the search path for config files. A config file given
with the <CODE>
<A HREF="#option-C">-C</A></CODE> option that has no path in
its name is searched in the current directory, in the list of directories
specified using <CODE>--cfg-path</CODE>, in directories given by environment variables,
and in a built-in default directory.</P>
<P>
<A NAME="option--dbgfile"></A> </P>
<DT><B><CODE>--dbgfile name</CODE></B><DD>
<P>Specify an output file for debug information. Available information will be
written to this file. Using the <CODE>-g</CODE> option for the compiler and assembler
will increase the amount of information available. Please note that debug
information generation is currently being developed, so the format of the
file and its contents are subject to change without further notice.</P>
<P>
<A NAME="option--large-alignment"></A> </P>
<DT><B><CODE>--large-alignment</CODE></B><DD>
<P>Disable warnings about a large combined alignment. See the discussion of the
<CODE>.ALIGN</CODE> directive in the ca65 Users Guide for further information.</P>
<DT><B><CODE>--lib file</CODE></B><DD>
<P>Links a library to the output. Use this command-line option instead of just
naming the library file, if the linker is not able to determine the file
type because of an unusual extension.</P>
<DT><B><CODE>--obj file</CODE></B><DD>
<P>Links an object file to the output. Use this command-line option instead
of just naming the object file, if the linker is not able to determine the
file type because of an unusual extension.</P>
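<P>For instance, with an object file and a library that carry unusual extensions
(the names <CODE>startup.bin</CODE> and <CODE>runtime.dat</CODE> are invented for this
sketch), the file types could be stated explicitly:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 -C mysystem.cfg -o prog --obj startup.bin --lib runtime.dat
</PRE>
</CODE></BLOCKQUOTE>
</P>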
<P>
<A NAME="option--obj-path"></A> </P>
<DT><B><CODE>--obj-path path</CODE></B><DD>
<P>Specify an object file search path. This option may be used more than once.
It adds a directory to the search path for object files. An object file
passed to the linker that has no path in its name is searched in the current
directory, in the list of directories specified using <CODE>--obj-path</CODE>, in
directories given by environment variables, and in a built-in default directory.</P>
<P>
<A NAME="option--warnings-as-errors"></A> </P>
<DT><B><CODE>--warnings-as-errors</CODE></B><DD>
<P>An error will be generated if any warnings were produced.</P>
</DL>
</P>
<H2><A NAME="s3">3.</A> <A HREF="#toc3">Search paths</A></H2>
<P>Starting with version 2.10, there are several search-path lists for files needed
by the linker: one for libraries, one for object files, and one for config
files.</P>
<H2><A NAME="ss3.1">3.1</A> <A HREF="#toc3.1">Library search path</A>
</H2>
<P>The library search-path list contains, in this order:</P>
<P>
<OL>
<LI>The current directory.</LI>
<LI>Any directory added with the <CODE>
<A HREF="#option--lib-path">--lib-path</A></CODE> option on the command line.</LI>
<LI>The value of the environment variable <CODE>LD65_LIB</CODE> if it is defined.</LI>
<LI>A subdirectory named <CODE>lib</CODE> of the directory defined in the environment
variable <CODE>CC65_HOME</CODE>, if it is defined.</LI>
<LI>An optionally compiled-in library path.</LI>
</OL>
</P>
<H2><A NAME="ss3.2">3.2</A> <A HREF="#toc3.2">Object file search path</A>
</H2>
<P>The object file search-path list contains, in this order:</P>
<P>
<OL>
<LI>The current directory.</LI>
<LI>Any directory added with the <CODE>
<A HREF="#option--obj-path">--obj-path</A></CODE> option on the command line.</LI>
<LI>The value of the environment variable <CODE>LD65_OBJ</CODE> if it is defined.</LI>
<LI>A subdirectory named <CODE>obj</CODE> of the directory defined in the environment
variable <CODE>CC65_HOME</CODE>, if it is defined.</LI>
<LI>An optionally compiled-in directory.</LI>
</OL>
</P>
<H2><A NAME="ss3.3">3.3</A> <A HREF="#toc3.3">Config file search path</A>
</H2>
<P>The config file search-path list contains, in this order:</P>
<P>
<OL>
<LI>The current directory.</LI>
<LI>Any directory added with the <CODE>
<A HREF="#option--cfg-path">--cfg-path</A></CODE> option on the command line.</LI>
<LI>The value of the environment variable <CODE>LD65_CFG</CODE> if it is defined.</LI>
<LI>A subdirectory named <CODE>cfg</CODE> of the directory defined in the environment
variable <CODE>CC65_HOME</CODE>, if it is defined.</LI>
<LI>An optionally compiled-in directory.</LI>
</OL>
</P>
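<P>The three search-path options can be combined on one command line. The
following is only a sketch: the directory names <CODE>cfg</CODE>, <CODE>build</CODE> and
<CODE>lib</CODE> and all file names are assumptions.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 --cfg-path cfg --obj-path build --lib-path lib -C mysystem.cfg -o prog main.o mylib.lib
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Following the lists above, <CODE>mysystem.cfg</CODE> would be looked up in the current
directory first and then in <CODE>cfg</CODE>, <CODE>main.o</CODE> in the current directory and
then in <CODE>build</CODE>, and <CODE>mylib.lib</CODE> in the current directory and then in
<CODE>lib</CODE>, before the environment variables and compiled-in defaults are tried.
The search-path options are given first in this sketch so that the directories
are already known when the later arguments are processed.</P>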
<H2><A NAME="s4">4.</A> <A HREF="#toc4">Detailed workings</A></H2>
<P>The linker does several things when combining object modules:</P>
<P>First, the command line is parsed from left to right. For each object file
encountered (object files are recognized by a magic word in the header, so
the linker does not care about the name), imported and exported
identifiers are read from the file and inserted in a table. If a library
name is given (libraries are also recognized by a magic word; there are no
special naming conventions), each module in the library is checked to see
whether one of its exports would satisfy an import from other modules. All
modules where this is the case are marked. If duplicate identifiers are
found, the linker issues warnings.</P>
<P>That procedure (parsing and reading from left to right) means that a
library may only satisfy references for object modules (given directly or from
a library) named <EM>before</EM> that library. With the command line</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 crt0.o clib.lib test.o
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>the module <CODE>test.o</CODE> must not contain references to modules in the library
<CODE>clib.lib</CODE>. If it does, you have to change the order of the modules
on the command line:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ld65 crt0.o test.o clib.lib
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Step two is to read the configuration file, assign start addresses
for the segments, and define any linker symbols (see
<A HREF="#config-files">Configuration files</A>).</P>
<P>After that, the linker is ready to produce an output file. Before doing that,
it checks its data for consistency. That is, it checks for unresolved
externals (if the output format is not relocatable) and for symbol type
mismatches (for example a zero-page symbol is imported by a module as an absolute
symbol).</P>
<P>Step four is to write the actual target files. In this step, the linker will
resolve any expressions contained in the segment data. Circular references are
also detected in this step (a symbol may have a circular reference that goes
unnoticed if the symbol is not used).</P>
<P>Step five is to output a map file with a detailed list of all modules,
segments and symbols encountered.</P>
<P>As a last step, if you give the <CODE>
<A HREF="#option-v">-v</A></CODE> switch
twice, you get a dump of the segment data. However, this may be quite
unreadable if you're not a developer. :-)</P>
<H2><A NAME="config-files"></A> <A NAME="s5">5.</A> <A HREF="#toc5">Configuration files</A></H2>
<P>Configuration files are used to describe the layout of the output file(s). Two
major topics are covered in a config file: The memory layout of the target
architecture, and the assignment of segments to memory areas. In addition,
several other attributes may be specified.</P>
<P>Case is ignored for keywords, that is, section or attribute names, but it is
<EM>not</EM> ignored for names and strings.</P>
<H2><A NAME="ss5.1">5.1</A> <A HREF="#toc5.1">Memory areas</A>
</H2>
<P>Memory areas are specified in a <CODE>MEMORY</CODE> section. Let's have a look at an
example (this one describes the usable memory layout of the C64):</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
RAM1: start = $0800, size = $9800;
ROM1: start = $A000, size = $2000;
RAM2: start = $C000, size = $1000;
ROM2: start = $E000, size = $2000;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>As you can see, there are two RAM areas and two ROM areas. The names
(before the colon) are arbitrary names that must start with a letter, with
the remaining characters being letters or digits. The names of the memory
areas are used when assigning segments. As mentioned above, case is
significant for those names.</P>
<P>The syntax above is used in all sections of the config file. The name
(<CODE>ROM1</CODE> etc.) is said to be an identifier; the remaining tokens up to the
semicolon specify attributes for this identifier. You may use the equal sign
to assign values to attributes, and you may use a comma to separate
attributes; you may also leave both out. But you <EM>must</EM> use a semicolon to
mark the end of the attributes for one identifier. The section above could also
have been written like this:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
# Start of memory section
MEMORY
{
RAM1:
start $0800
size $9800;
ROM1:
start $A000
size $2000;
RAM2:
start $C000
size $1000;
ROM2:
start $E000
size $2000;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>There are of course more attributes for a memory area than just start and
size. Start and size are mandatory attributes; that is, each memory area
defined <EM>must</EM> have these attributes given (the linker will check that). I
will cover other attributes later. As you may have noticed, I've used a
comment in the example above. Comments start with a hash mark ('#'); the
remainder of the line after this character is ignored.</P>
<H2><A NAME="ss5.2">5.2</A> <A HREF="#toc5.2">Segments</A>
</H2>
<P>Let's assume you have written a program for your trusty old C64, and you would
like to run it. For testing purposes, it should run in the <CODE>RAM</CODE> area. So
we will start to assign segments to memory sections in the <CODE>SEGMENTS</CODE>
section:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
CODE: load = RAM1, type = ro;
RODATA: load = RAM1, type = ro;
DATA: load = RAM1, type = rw;
BSS: load = RAM1, type = bss, define = yes;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>What we are doing here is telling the linker that all segments go into the
<CODE>RAM1</CODE> memory area in the order specified in the <CODE>SEGMENTS</CODE> section. So
the linker will first write the <CODE>CODE</CODE> segment, then the <CODE>RODATA</CODE>
segment, then the <CODE>DATA</CODE> segment - but it will not write the <CODE>BSS</CODE>
segment. Why? This is where the segment type comes in: For each segment specified, you may also
specify a segment type. There are five possible segment types:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ro means readonly
rw means read/write
bss means that this is an uninitialized segment
zp a zeropage segment
overwrite a segment that overwrites (parts of) another one
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>So, because we specified that the segment with the name BSS is of type bss,
the linker knows that this is uninitialized data, and will not write it to an
output file. This is an important point: For the assembler, the <CODE>BSS</CODE>
segment has no special meaning. You specify which segments have the bss
attribute when linking. This approach is much more flexible than having one
fixed bss segment, and is a result of the design decision to support an
arbitrary segment count.</P>
<P>If you specify "<CODE>type = bss</CODE>" for a segment, the linker will make sure that
this segment contains only uninitialized data (that is, zeroes), and issue
a warning if this is not the case.</P>
<P>For a <CODE>bss</CODE> type segment to be useful, it must be cleared somehow by your
program (this usually happens in the startup code - for example, the startup
code for cc65-generated programs takes care of that). But how does your
code know where the segment starts, and how big it is? The linker is able to
give that information, but you must request it. This is what we're doing with
the "<CODE>define = yes</CODE>" attribute in the <CODE>BSS</CODE> definition. For each
segment where this attribute is true, the linker will export three symbols.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
__NAME_LOAD__ This is set to the address where the
segment is loaded.
__NAME_RUN__ This is set to the run address of the
segment. We will cover run addresses
later.
__NAME_SIZE__ This is set to the segment size.
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Replace <CODE>NAME</CODE> by the name of the segment; in the example above, this would
be <CODE>BSS</CODE>. These symbols may be accessed by your code when imported using
the <CODE>.IMPORT</CODE> directive.</P>
<P>Now, as we've configured the linker to write the first three segments and
create symbols for the last one, there's only one question left: Where does
the linker put the data? It would be very convenient to have the data in a
file, wouldn't it?</P>
<H2><A NAME="ss5.3">5.3</A> <A HREF="#toc5.3">Output files</A>
</H2>
<P>We don't have any files specified above, and indeed, this is not needed in a
simple configuration like the one above. There is an additional attribute
"file" that may be specified for a memory area, that gives a file name to
write the area data into. If there is no file name given, the linker will
assign the default file name. This is "a.out" or the one given with the
<CODE>
<A HREF="#option-o">-o</A></CODE> option on the command line. Since the
default behaviour is OK for our purposes, I did not use the attribute in the
example above. Let's have a look at it now.</P>
<P>The "file" attribute (the keyword may also be written as "FILE" if you like
that better) takes a string enclosed in double quotes ('"') that specifies the
file where the data is written. You may specify the same file several times;
in that case, the data for all memory areas having this file name is written
into this file, in the order of the memory areas defined in the <CODE>MEMORY</CODE>
section. Let's specify some file names in the <CODE>MEMORY</CODE> section used above:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
RAM1: start = $0800, size = $9800, file = %O;
ROM1: start = $A000, size = $2000, file = "rom1.bin";
RAM2: start = $C000, size = $1000, file = %O;
ROM2: start = $E000, size = $2000, file = "rom2.bin";
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>The <CODE>%O</CODE> used here is a way to specify the default behaviour explicitly:
<CODE>%O</CODE> is replaced by a string (including the quotes) that contains the
default output name, that is, "a.out" or the name specified with the <CODE>
<A HREF="#option-o">-o</A></CODE> option on the command line. Into this file, the
linker will first write any segments that go into <CODE>RAM1</CODE>, and will then
append the segments for <CODE>RAM2</CODE>, because the memory areas are given in this
order. So, for the RAM areas, nothing has really changed.</P>
<P>We've not used the ROM areas, but we will do that below, so we give the file
names here. Segments that go into <CODE>ROM1</CODE> will be written to a file named
"rom1.bin", and segments that go into <CODE>ROM2</CODE> will be written to a file
named "rom2.bin". The name given on the command line is ignored in both cases.</P>
<P>Assigning an empty file name to a memory area will discard the data written
to it. This is useful if the memory area has segments assigned that are empty
(for example because they are of type bss). In that case, the linker would otherwise
create an empty output file. This may be suppressed by assigning an empty file
name to that memory area.</P>
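<P>A minimal sketch of this, reusing the area and segment names from the examples
above but moving <CODE>BSS</CODE> into <CODE>RAM2</CODE> (that placement is an assumption made
only for illustration):</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
    RAM1: start = $0800, size = $9800, file = %O;
    # RAM2 holds only uninitialized data, so no output file is wanted for it
    RAM2: start = $C000, size = $1000, file = "";
}
SEGMENTS {
    CODE:   load = RAM1, type = ro;
    RODATA: load = RAM1, type = ro;
    DATA:   load = RAM1, type = rw;
    BSS:    load = RAM2, type = bss, define = yes;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>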
<P>The <CODE>%O</CODE> sequence is also allowed inside a string. So using</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
ROM1: start = $A000, size = $2000, file = "%O-1.bin";
ROM2: start = $E000, size = $2000, file = "%O-2.bin";
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>would write two files that start with the name of the output file specified on
the command line, with "-1.bin" and "-2.bin" appended respectively. Because
'%' is used as an escape char, the sequence "%%" has to be used if a single
percent sign is required.</P>
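<P>To illustrate the escape sequence, here is a hypothetical file name (chosen
only for this sketch) that contains a literal percent sign. Assuming the linker
is run with <CODE>-o game</CODE>, the name should expand to "game-100%.bin":</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
    ROM1: start = $A000, size = $2000, file = "%O-100%%.bin";
}
</PRE>
</CODE></BLOCKQUOTE>
</P>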
<H2><A NAME="ss5.4">5.4</A> <A HREF="#toc5.4">OVERWRITE segments</A>
</H2>
<P>There are situations when you may wish to overwrite some part (or parts) of a
segment with another one. Perhaps you are modifying an OS ROM that has its
public subroutines at fixed, well-known addresses, and you want to prevent them
from shifting to other locations in memory if your changed code takes less
space. Or you are updating a block of code available in binary-only form with
fixes that are scattered in various places. Generally, whenever you want to
minimize the disturbance that your updates cause to existing code, OVERWRITE
segments are worth considering.</P>
<P>Here is an example:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
RAM: file = "", start = $6000, size = $2000, type=rw;
ROM: file = %O, start = $8000, size = $8000, type=ro;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Nothing unusual so far, just two memory blocks - one RAM, one ROM. Now let's
look at the segment configuration:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
RAM: load = RAM, type = bss;
ORIGINAL: load = ROM, type = ro;
FASTCOPY: load = ROM, start=$9000, type = overwrite;
JMPPATCH1: load = ROM, start=$f7e8, type = overwrite;
DEBUG: load = ROM, start=$8000, type = overwrite;
VERSION: load = ROM, start=$e5b7, type = overwrite;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>The segment named ORIGINAL contains the original code, disassembled or provided in
binary form (i.e. using the <CODE>.INCBIN</CODE> directive; see the <CODE>ca65</CODE> assembler
document). The four subsequent segments will be relocated to the addresses specified
by their "start" attributes ("offset" can also be used) and will then overwrite
whatever was at these locations in the ORIGINAL segment. In the end, the resulting
binary output file will contain the original data, with the exception of four
sequences starting at $9000, $f7e8, $8000 and $e5b7, which will contain the code from
their respective segments. How long these sequences are depends on the
lengths of the corresponding segments. They can even overlap, so think about what
you're doing.</P>
<P>Finally, note that OVERWRITE segments should be the final segments loaded to a
particular memory area, and that each of them needs either a "start" or an
"offset" attribute.</P>
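<P>As a sketch of the "offset" variant: the <CODE>ROM</CODE> memory area above starts at
$8000, so an offset of $1000 into that area corresponds to address $9000. Under
that assumption, the <CODE>FASTCOPY</CODE> entry could equivalently be written as:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
    ORIGINAL: load = ROM, type = ro;
    FASTCOPY: load = ROM, offset = $1000, type = overwrite;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>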
<H2><A NAME="ss5.5">5.5</A> <A HREF="#toc5.5">LOAD and RUN addresses (ROMable code)</A>
</H2>
<P>Let us now look at a more complex example. Say you've successfully tested
your new "Super Operating System" (SOS for short) for the C64, and you
will now go and replace the ROMs with your own code. When doing that, you
face a new problem: if the code runs in RAM, we need not care about
read/write data. But now that the code is in ROM, we must take care of it.
Remember the default segments (you may of course specify your own):</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
CODE read-only code
RODATA read-only data
DATA read/write data
BSS uninitialized data, read/write
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Since <CODE>BSS</CODE> is not initialized, we need not care about it now, but what
about <CODE>DATA</CODE>? <CODE>DATA</CODE> contains initialized data, that is, data that was
explicitly assigned a value. Your program will rely on these values on
startup. Since there's no way to preserve the contents of the data segment
other than storing it in one of the ROMs, we have to put it there.
Unfortunately, ROM is not writable, so we have to copy the data into RAM before
running the actual code.</P>
<P>The linker won't copy the data from ROM into RAM for you (this must be done by
the startup code of your program), but it has some features that will help you
in this process.</P>
<P>First, you may not only specify a "<CODE>load</CODE>" attribute for a segment, but
also a "<CODE>run</CODE>" attribute. The "<CODE>load</CODE>" attribute is mandatory, and, if
you don't specify a "<CODE>run</CODE>" attribute, the linker assumes that load area
and run area are the same. We will use this feature for our data area:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
CODE: load = ROM1, type = ro;
RODATA: load = ROM2, type = ro;
DATA: load = ROM2, run = RAM2, type = rw, define = yes;
BSS: load = RAM2, type = bss, define = yes;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Let's have a closer look at this <CODE>SEGMENTS</CODE> section. We specify that the
<CODE>CODE</CODE> segment goes into <CODE>ROM1</CODE> (the one at $A000). The read-only data
goes into <CODE>ROM2</CODE>. Read/write data will be loaded into <CODE>ROM2</CODE> but is run
in <CODE>RAM2</CODE>. That means that all references to labels in the <CODE>DATA</CODE>
segment are relocated to be in <CODE>RAM2</CODE>, but the segment is written to
<CODE>ROM2</CODE>. All your startup code has to do is copy the data from its
location in <CODE>ROM2</CODE> to the final location in <CODE>RAM2</CODE>.</P>
<P>So, how do you know where the data is located? This is the second point
where you get help from the linker. Remember the "<CODE>define</CODE>" attribute?
Since we have set this attribute to true, the linker will define three
external symbols for the data segment that may be accessed from your code:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
__DATA_LOAD__ This is set to the address where the segment
is loaded, in this case, it is an address in
ROM2.
__DATA_RUN__ This is set to the run address of the segment,
in this case, it is an address in RAM2.
__DATA_SIZE__ This is set to the segment size.
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>So, what your startup code must do is copy <CODE>__DATA_SIZE__</CODE> bytes from
<CODE>__DATA_LOAD__</CODE> to <CODE>__DATA_RUN__</CODE> before any other routines are called.
All references to labels in the <CODE>DATA</CODE> segment are relocated to <CODE>RAM2</CODE>
by the linker, so things will work properly.</P>
<P>There's a library subroutine called <CODE>copydata</CODE> (in a module named
<CODE>copydata.s</CODE>) that may be used to do the actual copying. Be sure to have a
look at its inner workings before using it!</P>
<H2><A NAME="MEMORY"></A> <A NAME="ss5.6">5.6</A> <A HREF="#toc5.6">Other MEMORY area attributes</A>
</H2>
<P>There are some other attributes not covered above. Before starting the
reference section, I will discuss the remaining things here.</P>
<P>You may also request symbol definitions for memory areas. This may be
useful for things like a software stack or an I/O area.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
STACK: start = $C000, size = $1000, define = yes;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This will define some external symbols that may be used in your code when
imported using the <CODE>.IMPORT</CODE> directive:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
__STACK_START__ This is set to the start of the memory
area, $C000 in this example.
__STACK_SIZE__ The size of the area, here $1000.
__STACK_LAST__ This is NOT the same as START+SIZE.
Instead, it is defined as the first
address that is not used by data. If we
don't define any segments for this area,
the value will be the same as START.
__STACK_FILEOFFS__ The binary offset in the output file. This
is not defined for relocatable output file
formats (o65).
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>A memory area may also have a type. Valid types are</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
ro for readonly memory
rw for read/write memory.
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>The linker will ensure that no segment marked as read/write or bss is put
into a memory area that is marked as read-only.</P>
<P>Unused memory in a memory area may be filled. Use the "<CODE>fill = yes</CODE>"
attribute to request this. The default value used to fill unused space is zero.
To use a different value, specify a byte value with the "<CODE>fillval</CODE>"
attribute. If a segment has no "<CODE>fillval</CODE>" attribute of its own, the
"<CODE>fillval</CODE>" attribute of the memory area (or
its default) is used instead. This means that the value may also be used to
fill gaps generated by the assembler's <CODE>.ALIGN</CODE> and <CODE>.RES</CODE>
directives.</P>
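<P>For instance, a ROM image is often padded with $FF rather than zero. A minimal
sketch, reusing the <CODE>ROM2</CODE> area from the earlier examples (the fill value is
only an illustration), might look like this:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
    ROM2: start = $E000, size = $2000, file = "rom2.bin",
          fill = yes, fillval = $FF;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>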
<P>The symbol <CODE>%S</CODE> may be used to access the default start address (that is,
the one defined in
<A HREF="#FEATURES">the FEATURES section</A>, or the
value given on the command line with the <CODE>
<A HREF="#option-S">-S</A></CODE>
option).</P>
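<P>A short sketch of how <CODE>%S</CODE> might be used, reusing the <CODE>RAM1</CODE> area
from the earlier examples (the size is kept from that example and is only
illustrative):</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
    RAM1: start = %S, size = $9800, file = %O;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>With this definition, the start of <CODE>RAM1</CODE> follows the default start
address instead of being fixed in the configuration.</P>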
<P>To support systems with banked memory, a special attribute named <CODE>bank</CODE> is
available. The attribute value is an arbitrary 32-bit integer. The assembler
has a built-in function named <CODE>.BANK</CODE>, which may be used with an argument
that has a segment reference (for example, a symbol). The result of this
function is the value of the bank attribute for the run memory area of the
segment.</P>
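<P>As a rough sketch, a banked ROM area might be declared as follows. The bank
number and the file name are assumptions made for this illustration only:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
MEMORY {
    ROM1: start = $A000, size = $2000, file = "rom1.bin", bank = 1;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>For a symbol in a segment whose run memory area is <CODE>ROM1</CODE>, the
<CODE>.BANK</CODE> function would then be expected to yield 1.</P>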
<H2><A NAME="ss5.7">5.7</A> <A HREF="#toc5.7">Other SEGMENT attributes</A>
</H2>
<P>Segments may be aligned to some memory boundary. Specify "<CODE>align = num</CODE>" to
request this feature. To align all segments on a page boundary, use</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
CODE: load = ROM1, type = ro, align = $100;
RODATA: load = ROM2, type = ro, align = $100;
DATA: load = ROM2, run = RAM2, type = rw, define = yes,
align = $100;
BSS: load = RAM2, type = bss, define = yes, align = $100;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>If an alignment is requested, the linker will add enough space to the output
file so that the new segment starts at an address that is divisible by the
given number without a remainder. All addresses are adjusted accordingly. To
fill the unused space, bytes of zero are used, or, if the memory area has a
"<CODE>fillval</CODE>" attribute, that value. Alignment is always needed if you have
used the <CODE>.ALIGN</CODE> command in the assembler. The alignment of a segment
must be equal to or greater than the alignment used in the <CODE>.ALIGN</CODE> command.
The linker will check that and issue a warning if the alignment of a segment
is lower than the alignment requested in an <CODE>.ALIGN</CODE> command of one of the
modules making up this segment.</P>
<P>For a given segment you may also specify a fixed offset into a memory area or
a fixed start address. Use this if you want the code to run at a specific
address (a prominent case is the interrupt vector table, which must go at
address $FFFA). Only one of <CODE>ALIGN</CODE>, <CODE>OFFSET</CODE> or <CODE>START</CODE> may be
specified. If the directive creates empty space, it will be filled with zero,
or with the value specified with the "<CODE>fillval</CODE>" attribute if one is given.
The linker will warn you if it is not possible to put the code at the
specified offset (this may happen if other segments in this area are too
large). Here's an example:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
VECTORS: load = ROM2, type = ro, start = $FFFA;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>or, equivalently for the <CODE>ROM2</CODE> memory area defined above (which starts at $E000),</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
VECTORS: load = ROM2, type = ro, offset = $1FFA;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>The "<CODE>align</CODE>", "<CODE>start</CODE>" and "<CODE>offset</CODE>" attributes change placement
of the segment in the run memory area, because this is what is usually
desired. If load and run memory areas are equal (which is the case if only the
load memory area has been specified), the attributes will also work. There is
also an "<CODE>align_load</CODE>" attribute that may be used to align the start of the
segment in the load memory area, in case different load and run areas have
been specified. There are no special attributes to set start or offset for
just the load memory area.</P>
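<P>A sketch of a segment that is page-aligned in both areas follows. It assumes
that "<CODE>align</CODE>" and "<CODE>align_load</CODE>" may be combined on one segment,
which the text above suggests but does not state outright:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
    DATA: load = ROM2, run = RAM2, type = rw, define = yes,
          align = $100, align_load = $100;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>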
<P>A "<CODE>fillval</CODE>" attribute may not only be specified for a memory area, but
also for a segment. The value must be an integer between 0 and 255. It is used
as the fill value for space reserved by the assembler's <CODE>.ALIGN</CODE> and <CODE>.RES</CODE>
commands. It is also used as the fill value for space between sections (part of a
segment that comes from one object file) caused by alignment, but not for
space that precedes the first section.</P>
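<P>As an illustration, gaps inside a code segment could be filled with $EA (the
6502 NOP opcode) instead of zero. The choice of value is an assumption made for
this sketch:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
    CODE: load = ROM1, type = ro, fillval = $EA;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>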
<P>To suppress the warning that the linker issues if it encounters a segment that is
not found in any of the input files, use "<CODE>optional=yes</CODE>" as an additional
segment attribute. Be careful when using this attribute, because a missing
segment may be a sign of a problem, and if you're suppressing the warning,
there is no one left to tell you about it.</P>
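<P>A minimal sketch, using a hypothetical segment named <CODE>EXTRA</CODE> that only
some of the input files provide:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
SEGMENTS {
    EXTRA: load = RAM1, type = rw, optional = yes;
}
</PRE>
</CODE></BLOCKQUOTE>
</P>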
<H2><A NAME="ss5.8">5.8</A> <A HREF="#toc5.8">The FILES section</A>
</H2>
<P>The <CODE>FILES</CODE> section is used to support formats other than straight binary
(which is the default, so binary output files do not need an explicit entry
in the <CODE>FILES</CODE> section).</P>
<P>The <CODE>FILES</CODE> section lists output files and, as the only attribute, the format of
each output file. Assigning binary format to the default output file would
look like this:</P>
<P>
<BLOCKQUOTE><CODE>