Network Working Group M. Krueger
Request for Comments: 3347 R. Haagens
Category: Informational Hewlett-Packard Corporation
C. Sapuntzakis
Stanford
M. Bakke
Cisco Systems
July 2002
Small Computer Systems Interface protocol over the Internet (iSCSI)
Requirements and Design Considerations
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2002). All Rights Reserved.
Abstract
This document specifies the requirements iSCSI and its related
infrastructure should satisfy and the design considerations guiding
the iSCSI protocol development efforts. In the interest of timely
adoption of the iSCSI protocol, the IPS group has chosen to focus the
first version of the protocol to work with the existing SCSI
architecture and commands, and the existing TCP/IP transport layer.
Both of these protocols are widely deployed and well understood. The
thought is that using these mature protocols will entail a minimum of
new invention, the most rapid possible adoption, and the greatest
compatibility with Internet architecture, protocols, and equipment.
Conventions used in this document
This document describes the requirements for a protocol design, but
does not define a protocol standard. Nevertheless, the key words
"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document
are to be interpreted as described in RFC-2119 [2].
Krueger, et al. Informational [Page 1]
RFC 3347 iSCSI Requirements and Design Considerations July 2002
Table of Contents
1. Introduction.................................................2
2. Summary of Requirements......................................3
3. iSCSI Design Considerations..................................7
3.1. General Discussion...........................................7
3.2. Performance/Cost.............................................9
3.3. Framing.....................................................11
3.4. High bandwidth, bandwidth aggregation.......................13
4. Ease of implementation/complexity of protocol...............14
5. Reliability and Availability................................15
5.1. Detection of Data Corruption................................15
5.2. Recovery....................................................15
6. Interoperability............................................16
6.1. Internet infrastructure.....................................16
6.2. SCSI........................................................16
7. Security Considerations.....................................18
7.1. Extensible Security.........................................18
7.2. Authentication..............................................18
7.3. Data Integrity..............................................19
7.4. Data Confidentiality........................................19
8. Management..................................................19
8.1. Naming......................................................20
8.2. Discovery...................................................21
9. Internet Accessibility......................................21
9.1. Denial of Service...........................................21
9.2. NATs, Firewalls and Proxy servers...........................22
9.3. Congestion Control and Transport Selection..................22
10. Definitions.................................................22
11. References..................................................23
12. Acknowledgements............................................24
13. Author's Addresses..........................................25
14. Full Copyright Statement....................................26
1. Introduction
The IP Storage Working Group is chartered with developing
comprehensive technology to transport block storage data over IP
protocols. This effort includes a protocol to transport the Small
Computer Systems Interface (SCSI) protocol over the Internet (iSCSI).
The initial version of the iSCSI protocol will define a mapping of
SCSI transport protocol over TCP/IP so that SCSI storage controllers
(principally disk and tape arrays and libraries) can be attached to
IP networks, notably Gigabit Ethernet (GbE) and 10 Gigabit Ethernet
(10 GbE).
The iSCSI protocol is a mapping of SCSI to TCP, and constitutes a
"SCSI transport" as defined by the ANSI T10 SCSI SAM-2 document
[SAM2, p. 3, "Transport Protocols"].
2. Summary of Requirements
The iSCSI standard:
From section 3.2 Performance/Cost:
MUST allow implementations to equal or improve on the current
state of the art for SCSI interconnects.
MUST enable cost competitive implementations.
SHOULD minimize control overhead to enable low delay
communications.
MUST provide high bandwidth and bandwidth aggregation.
MUST have low host CPU utilization, equal to or better than
current technology.
MUST be possible to build I/O adapters that handle the entire SCSI
task.
SHOULD permit direct data placement architectures.
MUST NOT impose complex operations on host software.
MUST provide for full utilization of available link bandwidth.
MUST allow an implementation to exploit parallelism (multiple
connections) at the device interfaces and within the interconnect
fabric.
From section 3.4 High Bandwidth/Bandwidth Aggregation:
MUST operate over a single TCP connection.
SHOULD support 'connection binding', and it MUST be optional to
implement.
From section 4 Ease of Implementation/Complexity of Protocol:
SHOULD keep the protocol simple.
SHOULD minimize optional features.
MUST specify feature negotiation at session establishment (login).
MUST operate correctly when no optional features are negotiated as
well as when individual option negotiations are unsuccessful.
From section 5.1 Detection of Data Corruption:
MUST support a data integrity check format for use in digest
generation.
MAY use separate digest for data and headers.
iSCSI header format SHOULD be extensible to include other data
integrity digest calculation methods.
From section 5.2 Recovery:
MUST specify mechanisms to recover in a timely fashion from
failures on the initiator, target, or connecting infrastructure.
MUST specify recovery methods for non-idempotent requests.
SHOULD take into account fail-over schemes for mirrored targets or
highly available storage configurations.
SHOULD provide a method for sessions to be gracefully terminated
and restarted that can be initiated by either the initiator or
target.
From section 6 Interoperability:
iSCSI protocol document MUST be clear and unambiguous.
From section 6.1 Internet Infrastructure:
MUST:
-- be compatible with both IPv4 and IPv6
-- use TCP connections conservatively, keeping in mind there may
be many other users of TCP on a given machine.
MUST NOT require changes to existing Internet protocols.
SHOULD minimize required changes to existing TCP/IP
implementations.
MUST be designed to allow future substitution of SCTP (for TCP) as
an IP transport protocol with minimal changes to iSCSI protocol
operation, protocol data unit (PDU) structures and formats.
From section 6.2 SCSI:
Any feature SAM2 requires in a valid transport mapping MUST be
specified by iSCSI.
MUST specify strictly ordered delivery of SCSI commands over an
iSCSI session between an initiator/target pair.
The command ordering mechanism SHOULD seek to minimize the amount
of communication necessary across multiple adapters doing
transport off-load.
MUST specify for each feature whether it is OPTIONAL, RECOMMENDED
or REQUIRED to implement and/or use.
MUST NOT require changes to the SCSI-3 command sets and SCSI
client code except where SCSI specifications point to
"transport dependent" fields and behavior.
SHOULD track changes to SCSI and the SCSI Architecture Model.
MUST be capable of supporting all SCSI-3 command sets and device
types.
SHOULD support ACA implementation.
MUST allow for the construction of gateways to other SCSI
transports.
MUST reliably transport SCSI commands from the initiator to the
target.
MUST correctly deal with iSCSI packet drop, duplication,
corruption, stale packets, and re-ordering.
From section 7.1 Extensible Security:
SHOULD require minimal configuration and overhead in the insecure
operation.
MUST provide for strong authentication when increased security is
required.
SHOULD allow integration of new security mechanisms without
breaking backwards compatible operation.
From section 7.2 Authentication:
MAY support various levels of authentication security.
MUST support private authenticated login.
iSCSI authenticated login MUST be resilient against attacks.
MUST support data origin authentication of its communications;
data origin authentication MAY be optional to use.
From section 7.3 Data Integrity:
SHOULD NOT preclude use of additional data integrity protection
protocols (IPSec, TLS).
From section 7.4 Data Confidentiality:
MUST provide for the use of a data encryption protocol such as TLS
or IPsec ESP to provide data confidentiality between iSCSI
endpoints.
From section 8 Management:
SHOULD be manageable using standard IP-based management protocols.
iSCSI protocol document MUST NOT define the management
architecture for iSCSI, or make explicit references to management
objects such as MIB variables.
From section 8.1 Naming:
MUST support the naming architecture of SAM-2. The means by which
an iSCSI resource is located MUST use or extend existing Internet
standard resource location methods.
MUST provide a means of identifying iSCSI targets by a unique
identifier that is independent of the path on which it is found.
The format for the iSCSI names MUST use existing naming
authorities.
An iSCSI name SHOULD be a human readable string in an
international character set encoding.
Standard Internet lookup services SHOULD be used to resolve iSCSI
names.
SHOULD deal with the complications of the new SCSI security
architecture.
iSCSI naming architecture MUST address support of SCSI 3rd party
operations such as EXTENDED COPY.
From section 8.2 Discovery:
MUST have no impact on the use of current IP network discovery
techniques.
MUST provide some means of determining whether an iSCSI service is
available through an IP address.
SCSI protocol-dependent techniques SHOULD be used for further
discovery beyond the iSCSI layer.
MUST provide a method of discovering, given an IP end point on its
well-known port, the list of SCSI targets available to the
requestor. The use of this discovery service MUST be optional.
From section 9 Internet Accessibility:
SHOULD be scrutinized for denial of service issues, and any issues
found should be addressed.
From section 9.2 Firewalls and Proxy Servers:
SHOULD allow deployment where functional and optimizing
middle-boxes such as firewalls, proxy servers and NATs are present.
Use of IP addresses and TCP ports SHOULD be firewall friendly.
From section 9.3 Congestion Control and Transport Selection:
MUST be a good network citizen with TCP-compatible congestion
control (as defined in [RFC2914]).
iSCSI implementations MUST NOT use multiple connections as a means
to avoid transport-layer congestion control.
3. iSCSI Design Considerations
3.1. General Discussion
Traditionally, storage controllers (e.g., disk array controllers,
tape library controllers) have supported the SCSI-3 protocol and have
been attached to computers by SCSI parallel bus or Fibre Channel.
The IP infrastructure offers compelling advantages for volume/
block-oriented storage attachment. It offers the opportunity to take
advantage of the performance/cost benefits provided by competition in
the Internet marketplace. This could reduce the cost of storage
network infrastructure by providing economies arising from the need
to install and operate only a single type of network.
In addition, the IP protocol suite offers the opportunity for a rich
array of management, security and QoS solutions. Organizations may
initially choose to operate storage networks based on iSCSI that are
independent of (isolated from) their current data networks except for
secure routing of storage management traffic. These organizations
anticipate benefits from the high performance/cost of IP equipment
and the opportunity for a unified management architecture. As
security and QoS evolve, it becomes reasonable to build combined
networks with shared infrastructure; nevertheless, it is likely that
sophisticated users will choose to keep their storage sub-networks
isolated to afford the best control of security and QoS to ensure a
high-performance environment tuned to storage traffic.
Mapping SCSI over IP also provides:
-- Extended distance ranges
-- Connectivity to "carrier class" services that support IP
The following applications for iSCSI are contemplated:
-- Local storage access, consolidation, clustering and pooling (as
in the data center)
-- Network client access to remote storage (e.g., a "storage service
provider")
-- Local and remote synchronous and asynchronous mirroring between
storage controllers
-- Local and remote backup and recovery
iSCSI will support the following topologies:
-- Point-to-point direct connections
-- Dedicated storage LAN, consisting of one or more LAN segments
-- Shared LAN, carrying a mix of traditional LAN traffic plus
storage traffic
-- LAN-to-WAN extension using IP routers or carrier-provided "IP
Datatone"
-- Private networks and the public Internet
IP LAN-WAN routers may be used to extend the IP storage network to
the wide area, permitting remote disk access (as for a storage
utility), synchronous and asynchronous remote mirroring, and remote
backup and restore (as for tape vaulting). In the WAN, using TCP
end-to-end avoids the need for specialized equipment for protocol
conversion, ensures data reliability, copes with network congestion,
and provides retransmission strategies adapted to WAN delays.
The iSCSI technology deployment will involve the following elements:
(1) Conclusion of a complete protocol standard and supporting
implementations;
(2) Development of Ethernet storage NICs and related driver and
protocol software; [NOTE: high-speed applications of iSCSI are
expected to require significant portions of the iSCSI/TCP/IP
implementation in hardware to achieve the necessary throughput.]
(3) Development of compatible storage controllers; and
(4) The likely development of translating gateways to provide
connectivity between the Ethernet storage network and the Fibre
Channel and/or parallel-bus SCSI domains.
(5) Development of specifications for iSCSI device management such
as MIBs, LDAP or XML schemas, etc.
(6) Development of management and directory service applications to
support a robust SAN infrastructure.
Products could initially be offered for Gigabit Ethernet attachment,
with rapid migration to 10 GbE. For performance competitive with
alternative SCSI transports, it will be necessary to implement the
performance path of the full protocol stack in hardware. These new
storage NICs might perform full-stack processing of a complete SCSI
task, analogous to today's SCSI and Fibre Channel HBAs, and might
also support all host protocols that use TCP (NFS, CIFS, HTTP, etc).
The charter of the IETF IP Storage Working Group (IPSWG) describes
the broad goal of mapping SCSI to IP using a transport that has
proven congestion avoidance behavior and broad implementation on a
variety of platforms. Within that broad charter, several transport
alternatives may be considered. Initial IPS work focuses on TCP, and
this requirements document is restricted to that domain of interest.
3.2. Performance/Cost
In general, iSCSI MUST allow implementations to equal or improve on
the current state of the art for SCSI interconnects. This goal
breaks down into several types of requirement:
Cost competitive with alternative storage network technologies:
In order to be adopted by vendors and the user community, the iSCSI
protocol MUST enable cost competitive implementations when compared
to other SCSI transports (Fibre Channel).
Low delay communication:
Conventional storage access is of a stop-and-wait remote procedure
call type. Applications typically employ very little pipelining of
their storage accesses, and so storage access delay directly impacts
performance. The delay imposed by current storage interconnects,
including protocol processing, is generally in the range of 100
microseconds. The use of caching in storage controllers means that
many storage accesses complete almost instantly, and so the delay of
the interconnect can have a high relative impact on overall
performance. When stop-and-wait IO is used, the delay of the
interconnect will affect performance. The iSCSI protocol SHOULD
minimize control overhead, which adds to delay.
Low host CPU utilization, equal to or better than current technology:
For competitive performance, the iSCSI protocol MUST allow three key
implementation goals to be realized:
(1) iSCSI MUST make it possible to build I/O adapters that handle an
entire SCSI task, as alternative SCSI transport implementations
do.
(2) The protocol SHOULD permit direct data placement ("zero-copy"
memory architectures, where the I/O adapter reads or writes host
memory exactly once per disk transaction).
(3) The protocol SHOULD NOT impose complex operations on the host
software, which would increase host instruction path length
relative to alternatives.
Direct data placement (zero-copy iSCSI):
Direct data placement refers to iSCSI data being placed directly "off
the wire" into the allocated location in memory with no intermediate
copies. Direct data placement significantly reduces the memory bus
and I/O bus loading in the endpoint systems, allowing improved
performance. It reduces the memory required for NICs, possibly
reducing the cost of these solutions.
This is an important implementation goal. In an iSCSI system, each
of the end nodes (for example host computer and storage controller)
should have ample memory, but the intervening nodes (NIC, switches)
typically will not.
High bandwidth, bandwidth aggregation:
The bandwidth (transfer rate, MB/sec) supported by storage
controllers is rapidly increasing, due to several factors:
1. Increase in disk spindle and controller performance;
2. Use of ever-larger caches, and improved caching algorithms;
3. Increased scale of storage controllers (number of supported
spindles, speed of interconnects).
The iSCSI protocol MUST provide for full utilization of available
link bandwidth. The protocol MUST also allow an implementation to
exploit parallelism (multiple connections) at the device interfaces
and within the interconnect fabric.
The next two sections further discuss the need for direct data
placement and high bandwidth.
3.3. Framing
Framing refers to the addition of information, in a header or in the
data stream, that allows implementations to locate the boundaries of
an iSCSI protocol data unit (PDU) within the TCP byte stream. There are
two technical requirements driving framing: interfacing needs, and
accelerated processing needs.
A framing solution that addresses the "interfacing needs" of the
iSCSI protocol will facilitate the implementation of a message-based
upper layer protocol (iSCSI) on top of an underlying byte streaming
protocol (TCP). Since TCP is a reliable transport, this can be
accomplished by including a length field in the iSCSI header. Finding
the protocol frame assumes that the receiver will parse from the
beginning of the TCP data stream, and never make a mistake (lose
alignment on packet headers).
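The length-field approach described above can be sketched as follows.
This is a minimal illustration only, not the iSCSI PDU format: the
4-byte big-endian length prefix and the names frame and Deframer are
assumptions made for this example.

```python
import struct
from typing import List

# Hypothetical header: a 4-byte big-endian payload length.
HEADER = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload


class Deframer:
    """Recovers message boundaries from a reliable, in-order byte stream."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add bytes as they arrive; return every message now complete."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf, 0)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # body not fully received yet
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return messages
```

Note that the receiver in this sketch can only find a boundary by
parsing every preceding header in order. If a stretch of the stream is
missing, it cannot locate the messages that follow, which is the
limitation the rest of this section discusses.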
The other technical requirement for framing, "accelerated
processing", stems from the need to handle increasingly higher data
rates in the physical media interface. Two needs arise from higher
data rates:
(1) LAN environment - NIC vendors seek ways to provide "zero-copy"
methods of moving data directly from the wire into application
buffers.
(2) WAN environment - the emergence of high bandwidth, high latency,
low bit error rate physical media places huge buffer
requirements on the physical interface solutions.
First, vendors are producing network processing hardware that
offloads network protocols to hardware solutions to achieve higher
data rates. The concept of "zero-copy" seeks to store blocks of data
in appropriate memory locations (aligned) directly off the wire, even
when data is reordered due to packet loss. This is necessary to
drive actual data rates of 10 Gigabit/sec and beyond.
Secondly, in order for iSCSI to be successful in the WAN arena it
must be possible to operate efficiently in high bandwidth, high delay
networks. The emergence of multi-gigabit IP networks with latencies
in the tens to hundreds of milliseconds presents a challenge. To
fill such large pipes, it is necessary to have tens of megabytes of
outstanding requests from the application. In addition, some
protocols potentially require tens of megabytes at the transport
layer to deal with buffering for reassembly of data when packets are
received out-of-order.
In both cases, the issue is the desire to minimize the amount of
memory and memory bandwidth required for iSCSI hardware solutions.
Consider that a network pipe at 10 Gbps x 200 msec holds 250 MB.
[Assume land-based communication with a spot half way around the
world at the equator. Ignore additional distance due to cable
routing. Ignore repeater and switching delays; consider only a
speed-of-light delay of 5 microsec/km. The circumference of the
globe at the equator is approx. 40000 km (round-trip delay must be
considered to keep the pipe full). 10 Gb/sec x 40000 km x 5
microsec/km x B / 8b = 250 MB]. In a conventional TCP
implementation, loss of a TCP segment means that stream processing
MUST stop until that segment is recovered, which takes at least a
time of <network round trip> to accomplish. Following the example
above, an implementation would be obliged to catch 250 MB of data
into an anonymous buffer before resuming stream processing; later,
this data would need to be moved to its proper location. Some
proponents of iSCSI seek some means of putting data directly where it
belongs, and avoiding extra data movement in the case of segment
drop. This is a key concept in understanding the debate behind
framing methodologies.
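The arithmetic in the bracketed note can be checked with a short
calculation. The sketch below assumes decimal megabytes (1 MB = 10^6
bytes), as the figures in the text imply; the function name and its
parameters are chosen for illustration only.

```python
def pipe_capacity_bytes(link_bps: float, round_trip_km: float,
                        delay_s_per_km: float = 5e-6) -> float:
    """Bytes in flight needed to keep a link full for one round trip.

    link_bps       -- link rate in bits per second
    round_trip_km  -- total round-trip path length in kilometres
    delay_s_per_km -- propagation delay (5 microsec/km, as in the text)
    """
    round_trip_s = round_trip_km * delay_s_per_km
    return link_bps * round_trip_s / 8  # 8 bits per byte


# The example from the text: 10 Gb/sec over a 40000 km round trip.
capacity = pipe_capacity_bytes(10e9, 40_000)
print(round(capacity / 1e6), "MB")
```

With the values from the text, the result is approximately 250 MB,
which is the amount of data an implementation would have to hold while
waiting for a single lost segment to be recovered.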
The framing of the iSCSI protocol impacts both the "interfacing
needs" and the "accelerated processing needs". However, while
including a length in a header may suffice for the "interfacing
needs", it will not serve the direct data placement needs. The
framing mechanism developed should allow resynchronization of packet
boundaries even in the case where a packet is temporarily missing in
the incoming data stream.
3.4. High bandwidth, bandwidth aggregation
At today's block storage transport throughput, any single link can be
saturated by the volume of storage traffic. Scientific data
applications and data replication are examples of storage
applications that push the limits of throughput.
Some applications, such as log updates, streaming tape, and
replication, require ordering of updates and thus ordering of SCSI
commands. An initiator may maintain ordering by waiting for each
update to complete before issuing the next (a.k.a. synchronous
updates). However, the throughput of synchronous updates falls as
network distance grows, in roughly inverse proportion.
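A rough model makes the distance penalty concrete. The sketch below
assumes that each synchronous update costs one network round trip plus
an optional fixed service time, and reuses the 5 microsec/km
propagation delay from Section 3.3; the function name and parameters
are illustrative assumptions, not part of any specification.

```python
def sync_updates_per_second(distance_km: float,
                            delay_s_per_km: float = 5e-6,
                            service_time_s: float = 0.0) -> float:
    """Upper bound on stop-and-wait updates per second over one path.

    distance_km is the one-way distance; each update is assumed to wait
    for a full round trip before the next one is issued.
    """
    round_trip_s = 2 * distance_km * delay_s_per_km
    per_update_s = round_trip_s + service_time_s
    if per_update_s <= 0:
        raise ValueError("per-update time must be positive")
    return 1.0 / per_update_s


for km in (10, 100, 1000):
    print(km, "km:", round(sync_updates_per_second(km)), "updates/sec")
```

Under these assumptions, every tenfold increase in distance cuts the
achievable update rate by a factor of ten, independent of link
bandwidth.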
For greater throughput, the SCSI task queuing mechanism allows an
initiator to have multiple commands outstanding at the target
simultaneously and to express ordering constraints on the execution
of those commands. The task queuing mechanism is only effective if
the commands arrive at the target in the order they were presented to
the initiator (FIFO order). The iSCSI standard must provide an
ordered transport of SCSI commands, even when commands are sent along
different network paths (see Section 6.2 SCSI). This is referred to
as "command ordering".
The iSCSI protocol MUST operate over a single TCP connection to
accommodate lower cost implementations. To enable higher performance
storage devices, the protocol should specify a means to allow
operation over multiple connections while maintaining the behavior of
a single SCSI port. This would allow the initiator and target to use
multiple network interfaces and multiple paths through the network
for increased throughput. There are a few potential ways to satisfy
the multiple path and ordering requirements.
A popular way to satisfy the multiple-path requirement is to have a
driver above the SCSI layer instantiate multiple copies of the SCSI
transport, each communicating to the target along a different path.
"Wedge" drivers use this technique today to attain high performance.
Unfortunately, wedge drivers must wait for acknowledgement of
completion of each request (stop-and-wait) to ensure ordered updates.
Another approach might be for the iSCSI protocol to use multiple
instances of its underlying transport (e.g. TCP). The iSCSI layer
would make these independent transport instances appear as one SCSI
transport instance and maintain the ability to do ordered SCSI
command queuing. The document will refer to this technique as
"connection binding" for convenience.
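One possible way to preserve command ordering across several bound
connections is sketched below. It assumes the initiator stamps each
command with a session-wide sequence number; that number, and the
class and method names, are hypothetical devices for this example and
are not defined by this document.

```python
from typing import Dict, List


class OrderedCommandQueue:
    """Target-side buffer that releases commands in initiator FIFO order,
    regardless of which connection delivered them."""

    def __init__(self, first_seq: int = 0) -> None:
        self._next = first_seq           # next sequence number to deliver
        self._pending: Dict[int, str] = {}

    def receive(self, seq: int, command: str) -> List[str]:
        """Accept a command from any connection; return those now
        deliverable to the SCSI layer, in order."""
        if seq < self._next:
            return []                    # stale or duplicate: ignore
        self._pending[seq] = command
        ready = []
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready
```

For example, if command 1 arrives on one connection before command 0
arrives on another, the queue holds command 1 and then releases both,
in order, once command 0 is received.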
The iSCSI protocol SHOULD support connection binding, and it MUST be
optional to implement.
In the presence of connection binding, there are two ways to assign
features to connections. In the symmetric approach, all the
connections are identical from a feature standpoint. In the
asymmetric model, connections have different features. For example,
some connections may be used primarily for data transfers whereas
others are used primarily for SCSI commands.
Since the iSCSI protocol must support the case where there is only
one transport connection, the protocol must have command, data, and
status travel over the same connection.
In the case of multiple connections, the iSCSI protocol must keep the
command and its associated data and status on the same connection
(connection allegiance). Sending data and status on the same
connection is desirable because this guarantees that status is
received after the data (TCP provides ordered delivery). In the case
where each connection is managed by a separate processor, allegiance
decreases the need for inter-processor communication. This symmetric
approach is a natural extension of the single connection approach.
An alternate approach that was extensively discussed involved sending
all commands on a single connection and the associated data and
status on a different connection (asymmetric approach). In this
scheme, the transport ensures the commands arrive in order. The
protocol on the data and status connections is simpler, perhaps
lending itself to a simpler realization in hardware. One
disadvantage of this approach is that the recovery procedure is
different if a command connection fails vs. a data connection. Some
argued that this approach would require greater inter-processor
communication when connections are spread across processors.
The reader may reference the mail archives of the IPS mailing list
between June and September of 2000 for extensive discussions on
symmetric vs asymmetric connection models.
4. Ease of implementation/complexity of protocol
Experience has shown that adoption of a protocol by the Internet
community is inversely proportional to its complexity. In addition,
the simpler the protocol, the easier it is to diagnose problems. The
designers of iSCSI SHOULD strive to fulfill the requirements of
creating a SCSI transport over IP, while keeping the protocol as
simple as possible.
In the interest of simplicity, iSCSI SHOULD minimize optional
features. When features are deemed necessary, the protocol MUST
specify feature negotiation at session establishment (login). The
iSCSI transport MUST operate correctly when no optional features are
negotiated as well as when individual option negotiations are
unsuccessful.
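As a rough sketch of the negotiation requirement above, the function
below starts from mandatory defaults and changes a value only when a
negotiation succeeds. The option names, the values, and the
offer/accept data shapes are hypothetical and are not taken from any
iSCSI specification.

```python
# Hypothetical option names and default values, for illustration only.
DEFAULTS = {"header_digest": "none", "data_digest": "none"}


def negotiate(offered, supported, defaults=DEFAULTS):
    """Return the operating parameters for a session.

    offered:   option name -> list of values, in the initiator's
               order of preference
    supported: option name -> set of values the target accepts
    """
    result = dict(defaults)
    for name, choices in offered.items():
        if name not in defaults:
            continue  # unknown options are ignored
        for value in choices:
            if value in supported.get(name, ()):
                result[name] = value
                break
        # No common value: the default stays in effect.
    return result


nothing_offered = negotiate({}, {})
failed = negotiate({"header_digest": ["crc"]}, {"header_digest": {"md5"}})
agreed = negotiate({"data_digest": ["crc"]}, {"data_digest": {"crc"}})
```

Both the empty offer and the failed negotiation leave the session on
its defaults, which is the behavior the requirement asks for.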
5. Reliability and Availability
5.1. Detection of Data Corruption
There have been several research papers that suggest that the TCP
checksum calculation allows a certain number of bit errors to pass
undetected [10] [11].
In order to protect against data corruption, the iSCSI protocol MUST
support a data integrity check format for use in digest generation.
The iSCSI protocol MAY use separate digests for data and headers. In
an iSCSI proxy or gateway situation, the iSCSI headers are removed
and re-built, and the TCP stream is terminated on either side. This
means that even the TCP checksum is removed and recomputed within the
gateway. To ensure the protection of commands, data, and status the
iSCSI protocol MUST include a CRC or other digest mechanism that is
computed on the SCSI data block itself, as well as on each command
and status message. Since gateways may strip iSCSI headers and
rebuild them, a separate header CRC is required. Two header digests,
one for invariant portions of the header (addresses) and one for the
variant portion would provide protection against changes to portions
of the header that should never be changed by middle boxes (e.g.,
addresses).
The iSCSI header format SHOULD be extensible to include other digest
calculation methods.
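The sketch below illustrates why separate header and data digests
help in a gateway. It uses Python's zlib.crc32 purely as a stand-in
digest; this document does not select a digest algorithm, and the
framing (header, 4-byte header digest, data, 4-byte data digest) and
the function names are assumptions.

```python
import zlib


def _crc(blob: bytes) -> bytes:
    # Stand-in digest; the real algorithm is not chosen by this document.
    return zlib.crc32(blob).to_bytes(4, "big")


def add_digests(header: bytes, data: bytes) -> bytes:
    # Hypothetical framing: header, header digest, data, data digest.
    return header + _crc(header) + data + _crc(data)


def check_digests(pdu: bytes, header_len: int) -> bool:
    header = pdu[:header_len]
    header_digest = pdu[header_len:header_len + 4]
    data = pdu[header_len + 4:-4]
    data_digest = pdu[-4:]
    return _crc(header) == header_digest and _crc(data) == data_digest


def rebuild_header(pdu: bytes, header_len: int, new_header: bytes) -> bytes:
    # A gateway replaces the header and its digest, but forwards the
    # data and the original data digest untouched.
    tail = pdu[header_len + 4:]
    return new_header + _crc(new_header) + tail


pdu = add_digests(b"hdr", b"payload")
forwarded = rebuild_header(pdu, 3, b"newhdr")

corrupted = bytearray(pdu)
corrupted[8] ^= 0x01  # flip one bit inside the data portion
```

Because the data digest is carried through unchanged, corruption
introduced inside the gateway would still be detected by the final
receiver.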
5.2. Recovery
The SCSI protocol was originally designed for a parallel bus
transport that was highly reliable. SCSI applications tend to assume
that transport errors never happen, and when they do, SCSI
application recovery tends to be expensive in terms of time and
computational resources.
iSCSI protocol design, while placing an emphasis on simplicity, MUST
lead to timely recovery from failure of initiator, target, or
connecting network infrastructure (cabling, data path equipment such
as routers, etc).
iSCSI MUST specify recovery methods for non-idempotent requests, such
as operations on tape drives.
The iSCSI protocol error recovery mechanism SHOULD take into account
fail-over schemes for mirrored targets or highly available storage
configurations that provide paths to target data through multiple
"storage servers". This would provide a basis for layered
technologies like high availability and clustering.
The iSCSI protocol SHOULD also provide a method for sessions to be
gracefully terminated and restarted that can be initiated by either
the initiator or target. This provides the ability to gracefully
fail over an initiator or target, or reset a target after performing
maintenance tasks such as upgrading software.
6. Interoperability
It must be possible for initiators and targets that implement the
required portions of the iSCSI specification to interoperate. While
this requirement is so obvious that it doesn't seem worth mentioning,
if the protocol specification contains ambiguous wording, different
implementations may not interoperate. The iSCSI protocol document
MUST be clear and unambiguous.
6.1. Internet infrastructure
The iSCSI protocol MUST:
-- be compatible with both IPv4 and IPv6.
-- use TCP connections conservatively, keeping in mind there may
be many other users of TCP on a given machine.
The iSCSI protocol MUST NOT require changes to existing Internet
protocols and SHOULD minimize required changes to existing TCP/IP
implementations.
iSCSI MUST be designed to allow future substitution of SCTP (for TCP)
as an IP transport protocol with minimal changes to iSCSI protocol
operation, protocol data unit (PDU) structures and formats. Although
not widely implemented today, SCTP has many design features that make
it a desirable choice for future iSCSI enhancement.
6.2. SCSI
In order to be considered a SCSI transport, the iSCSI standard must
comply with the requirements of the SCSI Architecture Model [SAM-2]
for a SCSI transport. Any feature SAM-2 requires in a valid transport
mapping MUST be specified by iSCSI. The iSCSI protocol document MUST
specify for each feature whether it is OPTIONAL, RECOMMENDED or
REQUIRED to implement and/or use.
The SCSI Architectural Model [SAM-2] indicates an expectation that
the SCSI transport provides ordering of commands at
initiator-target-LUN granularity. There has been much discussion on
the IPS reflector and in working group meetings regarding the means to
ensure this ordering. The rough consensus is that iSCSI MUST specify
strictly ordered delivery of SCSI commands over an iSCSI session
between an initiator/target pair, even in the presence of transport
errors. This command ordering mechanism SHOULD seek to minimize the
amount of communication necessary across multiple adapters doing
transport off-load. If an iSCSI implementation does not require
ordering it can instantiate multiple sessions per initiator-target
pair.
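One possible way to obtain strictly ordered delivery across several
connections is a per-session command sequence number, sketched below.
This is an illustrative assumption, not a mechanism defined by this
document; the class and method names are hypothetical, and sequence
number wraparound is deliberately ignored.

```python
class OrderedDelivery:
    """Hypothetical target-side reordering buffer for one session."""

    def __init__(self, first_sn=0):
        self.expected = first_sn  # next sequence number to deliver
        self.pending = {}         # sequence number -> held command

    def receive(self, sn, command):
        """Return the commands that become deliverable, in order."""
        if sn < self.expected or sn in self.pending:
            return []  # stale or duplicate command, dropped
        self.pending[sn] = command
        ready = []
        while self.expected in self.pending:
            ready.append(self.pending.pop(self.expected))
            self.expected += 1
        return ready


target = OrderedDelivery()
early = target.receive(1, "WRITE")     # arrived first on another connection
in_order = target.receive(0, "READ")   # fills the gap
duplicate = target.receive(0, "READ")  # retransmission after an error
```

The only state shared between connections here is the expected
sequence number and the pending map, which is in the spirit of
minimizing communication across adapters.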
iSCSI is intended to be a new SCSI "transport" [SAM-2]. As a mapping
of SCSI over TCP, iSCSI requires interaction with both T10 and IETF.
However, the iSCSI protocol MUST NOT require changes to the SCSI-3
command sets and SCSI client code except where SCSI specifications
point to "transport dependent" fields and behavior. For example,
changes to SCSI documents will be necessary to reflect lengthier
iSCSI target names and potentially lengthier timeouts. Collaboration
with T10 will be necessary to achieve this requirement.
The iSCSI protocol SHOULD track changes to SCSI and the SCSI
Architecture Model.
The iSCSI protocol MUST be capable of supporting all SCSI-3 command
sets and device types. The primary focus is on supporting 'larger'
devices: host computers and storage controllers (disk arrays, tape
libraries). However, other command sets (printers, scanners) must be
supported. These requirements MUST NOT be construed to mean that
iSCSI must be natively implementable on all of today's SCSI devices,
which might have limited processing power or memory.
ACA (Auto Contingent Allegiance) is an optional SCSI mechanism that
stops execution of a sequence of dependent SCSI commands when one of
them fails. The situation surrounding it is complex - T10 specifies
ACA in SAM-2, and hence iSCSI must support it and endeavor to make
sure that ACA gets implemented sufficiently (two independent
interoperable implementations) to avoid dropping ACA in the
transition from Proposed Standard to Draft Standard. This implies
iSCSI SHOULD support ACA implementation.
The iSCSI protocol MUST allow for the construction of gateways to
other SCSI transports, including parallel SCSI [SPI-X] and to SCSI
FCP [FCP, FCP-2]. It MUST be possible to construct "translating"
gateways so that iSCSI hosts can interoperate with SCSI-X devices; so
that SCSI-X devices can communicate over an iSCSI network; and so
that SCSI-X hosts can use iSCSI targets (where SCSI-X refers to
parallel SCSI, SCSI-FCP, or SCSI over any other transport). This
requirement is implied by support for SAM-2, but is worthy of
emphasis. These are true application protocol gateways, and not just
bridge/routers. The different standards have only the SCSI-3 command
set layer in common. These gateways are not mere packet forwarders.
The iSCSI protocol MUST reliably transport SCSI commands from the
initiator to the target. According to [SAM-2, p. 17.] "The function
of the service delivery subsystem is to transport an error-free copy
of the request or response between the sender and the receiver"
[SAM-2, p. 22]. The iSCSI protocol MUST correctly deal with iSCSI
packet drop, duplication, corruption, stale packets, and re-ordering.
7. Security Considerations
In the past, directly attached storage systems have implemented
minimal security checks because the physical connection offered
little chance for attack. Transporting block storage (SCSI) over IP
opens a whole new opportunity for a variety of malicious attacks.
Attacks can take the active form (identity spoofing, man-in-the-
middle) or the passive form (eavesdropping).
7.1. Extensible Security
The security services required for communications depend on the
individual network configurations and environments. Organizations
are setting up Virtual Private Networks (VPN), also known as
Intranets, that will require one set of security functions for
communications within the VPN and possibly many different security
functions for communications outside the VPN to support
geographically separate components. The iSCSI protocol is applicable
to a wide range of internetworking environments that may employ
different security policies. iSCSI MUST provide for strong
authentication when increased security is required. The protocol
SHOULD require minimal configuration and overhead in the insecure
operation, and allow integration of new security mechanisms without
breaking backwards compatible operation.
7.2. Authentication