3_0 Protocol Details
The NFS protocol is based on Sun RPC and uses XDR as its data serialization format.
XDR is the data serialization format used by NFS and related protocols. It defines a way to encode basic data types, lists, enums, structs and tagged unions. XDR also provides a C-like language to define data types which can be used to automatically generate classes, serialization and deserialization code for multiple programming languages.
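The basic XDR encoding rules can be sketched in a few lines of Python. This is a minimal illustration, not a full XDR library; the helper names (xdr_uint32, xdr_opaque, xdr_string, xdr_array) are made up for this example. It relies on two rules from the XDR standard: integers are encoded big-endian, and variable-length data is prefixed with its length and zero-padded to a multiple of four bytes.

```python
import struct


def xdr_uint32(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">I", value)


def xdr_uint64(value: int) -> bytes:
    """Encode an unsigned 64-bit integer (XDR 'unsigned hyper')."""
    return struct.pack(">Q", value)


def xdr_opaque(data: bytes) -> bytes:
    """Variable-length opaque: length prefix, data, zero padding to a multiple of 4."""
    padding = (4 - len(data) % 4) % 4
    return xdr_uint32(len(data)) + data + b"\x00" * padding


def xdr_string(text: str) -> bytes:
    """Strings use the same layout as variable-length opaque data."""
    return xdr_opaque(text.encode("utf-8"))


def xdr_array(items, encode_item) -> bytes:
    """Variable-length array: element count followed by each encoded element."""
    return xdr_uint32(len(items)) + b"".join(encode_item(item) for item in items)
```

A file handle, for example, would be encoded with xdr_opaque, and a list of integers with xdr_array(values, xdr_uint32).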
NFS and related protocols are based on Sun RPC (also known as ONC RPC). Originally, UDP was used as the transport protocol; newer versions also support TCP and other transports, including RDMA for high-performance applications.
The authentication flavours and RPC-with-TLS are part of the RPC layer and can also be used for other RPC-based protocols.
The RPC protocol allows a client to perform operations on a server by calling a procedure which is identified by a number and providing arguments. The server then returns a result. For example, if a client wants to read a file in NFSv3, it calls the READ procedure, which has the ID 6 and transmits an XDR-encoded READ3args struct containing parameters like the file handle to identify the file, the position in the file and the byte count. The server responds with a READ3res tagged union which can either contain a READ3resok struct in case the operation was successful or a READ3resfail struct if there was an error.
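The READ example above can be sketched as follows. This is a hedged illustration of how such a call message might be assembled, not a working client: it uses AUTH_NONE credentials (real servers usually expect AUTH_SYS or stronger), the function names are invented for this example, and the file handle passed in is a placeholder. The NFS program number 100003 is the one rpcinfo reports for the nfs service.

```python
import struct

NFS_PROGRAM = 100003   # RPC program number of NFS
NFS_VERSION = 3
NFSPROC3_READ = 6      # procedure ID of READ in NFSv3
RPC_CALL = 0           # message type CALL
RPC_VERSION = 2        # version of the RPC protocol itself
AUTH_NONE = 0          # assumption: no credentials are sent


def xdr_opaque(data: bytes) -> bytes:
    """Length prefix, data, zero padding to a multiple of 4 bytes."""
    return struct.pack(">I", len(data)) + data + b"\x00" * (-len(data) % 4)


def read3args(file_handle: bytes, offset: int, count: int) -> bytes:
    """READ3args: file handle (opaque), offset (uint64), byte count (uint32)."""
    return xdr_opaque(file_handle) + struct.pack(">QI", offset, count)


def rpc_call(xid: int, program: int, version: int, procedure: int, args: bytes) -> bytes:
    """RPC call header followed by the XDR-encoded procedure arguments."""
    header = struct.pack(">6I", xid, RPC_CALL, RPC_VERSION, program, version, procedure)
    null_auth = struct.pack(">II", AUTH_NONE, 0)  # flavor plus zero-length body
    # Credentials and verifier are both sent as empty AUTH_NONE structures.
    return header + null_auth + null_auth + args


def read_call(xid: int, file_handle: bytes, offset: int, count: int) -> bytes:
    """Complete RPC message asking the server to READ from a file."""
    return rpc_call(xid, NFS_PROGRAM, NFS_VERSION, NFSPROC3_READ,
                    read3args(file_handle, offset, count))
```

Over TCP, the message would additionally need the RPC record marking header in front of it; over UDP it can be sent as a single datagram.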
The two versions of NFS that are prevalent and supported by most server implementations today are NFSv3 and NFSv4.
NFSv3 requires three protocols to work.
Apart from the NFS protocol, which is used to perform the actual file operations, there are two other services that a client has to contact before it can start using an NFS export: rpcbind and mount, which will be explained later.
There are also some optional protocols such as nlockmgr, which is responsible for file locking, and nfs_acl, which is used to control access to files.
The NFS protocol is used for the actual file operations. One of the design goals of the NFS protocol is that the server is stateless in order to simplify crash recovery. For this reason, there is no need to open or close files. If a client has a file handle, it can perform read and write operations.
In NFS, files and directories are identified using file handles, not by their name or path. When a client wants the server to perform an operation on an object, it has to provide a file handle.
On the protocol level, a file handle is represented in XDR as a variable-length byte array. The structure of a file handle and the information stored in it are not standardized; they are determined by the server implementation. Clients should only use file handles that they have received from the server and are not supposed to interpret or modify them. Later we will look at the internal structure of file handles generated by Linux and Windows servers. A file handle for a specific file should remain valid and point to the same file after a server restart, and even when the file is renamed or moved to a different directory. This imitates the properties of file descriptors that some applications rely on.
Servers implement different mechanisms to prevent clients from tampering with file handles.
Linux can verify that file handles provided by the client point to files within the export if the subtree_check option is set.
Windows appends a signature to each file handle by default.
In order to obtain a file handle for a specific file or directory, a client has to know the file handle of its parent directory and perform a LOOKUP or READDIRPLUS operation to request the file handle it needs.
Initially the client does not know any file handle and cannot do anything.
This is why the mount protocol is necessary. Before a client can use an NFS export, it has to obtain the export directory's file handle from the mount daemon.
The client can then use this file handle to recursively request any other file handle in the export.
Some ports for Sun RPC protocols are dynamically assigned by rpcbind (also called portmap or portmapper) at runtime.
A client that wants to use an RPC service on the server has to request the port from rpcbind, which is always running on port 111.
rpcbind provides a list of all supported services, versions, and ports, and reports whether they use TCP or UDP as the transport protocol.
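A port request to rpcbind might look like the following sketch. It assumes version 2 of the portmapper protocol and its GETPORT procedure (number 3), uses AUTH_NONE, and only builds and parses the messages; sending them to port 111 is left out. The function names are invented for this example.

```python
import struct

PMAP_PROGRAM = 100000    # RPC program number of portmapper / rpcbind
PMAP_VERSION = 2         # assumption: the version 2 protocol is used
PMAPPROC_GETPORT = 3
IPPROTO_TCP = 6
IPPROTO_UDP = 17


def getport_call(xid: int, program: int, version: int, protocol: int = IPPROTO_TCP) -> bytes:
    """RPC call asking rpcbind on which port a program/version is listening."""
    # xid, CALL (0), RPC version 2, then program, version and procedure.
    header = struct.pack(">6I", xid, 0, 2, PMAP_PROGRAM, PMAP_VERSION, PMAPPROC_GETPORT)
    null_auth = struct.pack(">II", 0, 0)  # AUTH_NONE credentials and verifier
    # Mapping struct: program, version, protocol, port (port is ignored in requests).
    mapping = struct.pack(">4I", program, version, protocol, 0)
    return header + null_auth * 2 + mapping


def parse_getport_reply(reply: bytes, xid: int) -> int:
    """Return the port from an accepted GETPORT reply (0 means not registered)."""
    reply_xid, msg_type, reply_stat, _verf_flavor, verf_len = struct.unpack(">5I", reply[:20])
    # msg_type 1 is REPLY, reply_stat 0 is MSG_ACCEPTED.
    if reply_xid != xid or msg_type != 1 or reply_stat != 0:
        raise ValueError("not an accepted reply to this call")
    pos = 20 + verf_len + (-verf_len % 4)  # skip the verifier body and its padding
    (accept_stat,) = struct.unpack(">I", reply[pos:pos + 4])
    if accept_stat != 0:
        raise ValueError(f"call failed with accept_stat {accept_stat}")
    (port,) = struct.unpack(">I", reply[pos + 4:pos + 8])
    return port
```

For example, getport_call(1, 100005, 3) asks for the TCP port of version 3 of the mount service.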
It is possible to see all protocols supported by a server with the rpcinfo command:
$ rpcinfo -p <HOST>
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 44874 status
100024 1 tcp 43269 status
100005 1 udp 47905 mountd
100005 1 tcp 42015 mountd
100005 2 udp 50936 mountd
100005 2 tcp 53331 mountd
100005 3 udp 43168 mountd
100005 3 tcp 38723 mountd
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100227 3 tcp 2049 nfs_acl
100021 1 udp 39309 nlockmgr
100021 3 udp 39309 nlockmgr
100021 4 udp 39309 nlockmgr
100021 1 tcp 42865 nlockmgr
100021 3 tcp 42865 nlockmgr
100021 4 tcp 42865 nlockmgr
This output shows that the server supports NFS versions 3 and 4 via TCP on port 2049 as well as multiple versions of the mount protocol on other ports. NFS typically runs on port 2049, while mount and other protocols have dynamically assigned port numbers. On Windows, all protocols run on port 2049.
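When this listing has to be processed by a script, it can be parsed into records. The following is a small sketch that assumes the whitespace-separated column layout shown above; RpcService, parse_rpcinfo and nfs_versions are names invented for this example.

```python
from typing import List, NamedTuple


class RpcService(NamedTuple):
    program: int
    version: int
    protocol: str
    port: int
    name: str


def parse_rpcinfo(output: str) -> List[RpcService]:
    """Parse the text printed by `rpcinfo -p` into a list of services."""
    services = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not a service entry.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        name = fields[4] if len(fields) > 4 else ""
        services.append(
            RpcService(int(fields[0]), int(fields[1]), fields[2], int(fields[3]), name)
        )
    return services


def nfs_versions(services: List[RpcService]) -> List[int]:
    """NFS versions registered with rpcbind (program number 100003)."""
    return sorted({s.version for s in services if s.program == 100003})
```

Note that an empty result from nfs_versions does not prove the absence of NFS, since an NFSv4-only server may not register with rpcbind at all.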
If a server only supports NFSv4, rpcbind might not be running because it is no longer needed.
The mount protocol is used by clients to request information about available exports on a server and to obtain the file handle of the export root directory, as previously mentioned. It also reports the list of hosts that are allowed to access each export. This information can be viewed with showmount -e.
This performs an EXPORT operation on the server and prints the results.
$ showmount -e <HOST>
Export list for 192.168.247.129:
/nfs_test/wildcard *
/nfs_test/netgroup @testgroup
/nfs_test/wildcard2 192.168.247.1?8
/nfs_test/dns *.ipa.test
/nfs_test/subnet 192.168.248.0/255.255.255.0,192.168.247.0/24
/nfs_test/hostname 192.168.247.128
/nfs_test/everyone (everyone)
/nfs_test/subdirectory 192.168.247.128
/nfs_test 192.168.247.130,192.168.247.128
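The export list can be turned into a mapping from export path to allowed clients. This is a sketch that assumes the output layout shown above and export paths without whitespace; parse_exports and open_exports are hypothetical helper names.

```python
from typing import Dict, List


def parse_exports(output: str) -> Dict[str, List[str]]:
    """Map each export path printed by `showmount -e` to its client list."""
    exports = {}
    for line in output.splitlines():
        if not line.startswith("/"):
            continue  # skip the "Export list for ..." header
        parts = line.split(None, 1)
        # Clients are separated by commas; an export may also have no client column.
        clients = parts[1].strip().split(",") if len(parts) > 1 else []
        exports[parts[0]] = clients
    return exports


def open_exports(exports: Dict[str, List[str]]) -> List[str]:
    """Exports that are not restricted to specific hosts."""
    return [path for path, clients in exports.items()
            if "*" in clients or "(everyone)" in clients]
```

Entries such as @testgroup or *.ipa.test are kept as plain strings, since interpreting netgroups and DNS wildcards requires information that only the server has.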
If a server only supports NFSv4, the mount service might not be running on the server since it is not used by NFSv4. In that case, showmount won't show any exports even though an NFS server is running.
mountd also keeps track of clients that have exports mounted. This information can be viewed using showmount -a.
This shows the results of the DUMP operation.
$ showmount -a <HOST>
All mount points on 192.168.247.129:
192.168.247.128:/nfs_test
192.168.247.128:/nfs_test/subdirectory
192.168.247.128:/nfs_test/wildcard
192.168.247.130:/nfs_test
192.168.247.130:/nfs_test/everyone
This information may, however, be unreliable. If a client crashes or unmounts an export without telling mountd, it will remain in the list. It is also possible that a client tells mountd that it unmounted the export but keeps accessing it. Clients that use NFSv4 don't show up in this list either.
The Windows NFS server does not report any clients.
nfs_analyze is able to extract more information from mountd than showmount by pretending to be a client that wants to mount every export. It will send an MNT operation for each export reported by the EXPORT operation. The response to the MNT operation contains the authentication methods allowed for the requesting host and the export root file handle.
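The arguments and result of an MNT operation can be sketched as follows. This is an illustration based on version 3 of the mount protocol, not the code used by nfs_analyze; mnt_args and parse_mountres3 are invented names, and the RPC header around the arguments and result is left out.

```python
import struct
from typing import List, Optional, Tuple

MOUNT_PROGRAM = 100005   # RPC program number of mountd
MOUNT_VERSION = 3
MOUNTPROC3_MNT = 1       # procedure number of MNT
MNT3_OK = 0


def mnt_args(export_path: str) -> bytes:
    """MNT takes a single argument: the export path as an XDR string."""
    data = export_path.encode("utf-8")
    return struct.pack(">I", len(data)) + data + b"\x00" * (-len(data) % 4)


def parse_mountres3(body: bytes) -> Tuple[int, Optional[bytes], List[int]]:
    """Return (status, root file handle, allowed authentication flavors)."""
    (status,) = struct.unpack(">I", body[:4])
    if status != MNT3_OK:
        return status, None, []  # on failure only the status code is present
    (fh_len,) = struct.unpack(">I", body[4:8])
    handle = body[8:8 + fh_len]
    pos = 8 + fh_len + (-fh_len % 4)  # skip the handle and its padding
    (count,) = struct.unpack(">I", body[pos:pos + 4])
    flavors = list(struct.unpack(f">{count}I", body[pos + 4:pos + 4 + 4 * count]))
    return status, handle, flavors
```

In the flavor list, 1 stands for AUTH_SYS, and 390003 to 390005 are the Kerberos pseudo-flavors krb5, krb5i and krb5p.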
The NFSv3 protocol itself does not support ACLs. There is an unofficial protocol that was developed by Oracle and is now also supported by Linux, which allows clients to retrieve and modify ACLs on the server.
In NFSv4, there is only one service listening on port 2049, which makes rpcbind unnecessary.
This also simplifies firewall configurations compared to the dynamic ports used by NFSv3 and its related protocols.
In contrast to NFSv3, where clients receive a list of exported directories from mountd, in NFSv4 all exports are presented to the client as subdirectories of a virtual root directory.
Clients can use the PUTROOTFH operation to get the file handle of this root directory, which makes the mount protocol unnecessary.
In NFSv3, the mount protocol was also used by the client to find out which authentication methods are supported for which export.
Since the mount protocol is not used in NFSv4, clients can either use the SECINFO operation to ask the server which authentication methods are supported for a specific file, or they can just try to perform an operation on the file, and the server will return NFS4ERR_WRONGSEC if another authentication method is needed.
ACLs are now part of the NFS protocol itself, which makes the separate acl protocol unnecessary.
Another change compared to NFSv3 is that it is now possible to perform multiple operations in one RPC request which can reduce the number of roundtrips.
On the protocol level, there are only two procedures: NULL and COMPOUND.
The COMPOUND procedure accepts an array of arguments for operations and returns an array of results.
For example, if a client knows the file handle of a directory and wants to read a file in that directory but only knows its name, not its file handle, it can send a LOOKUP and a READ operation in one request. The file handle returned by the LOOKUP will automatically be passed to the READ operation.
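The arguments of such a COMPOUND request could be encoded as in the following sketch. It makes several assumptions that are not stated above: a PUTFH operation is sent first to tell the server which directory handle the LOOKUP refers to, the READ uses the all-zero anonymous stateid instead of one obtained from a prior OPEN (which many servers will require), and the minor version is 0. The function name compound_lookup_read is invented for this example, and the RPC header is omitted.

```python
import struct

# Operation numbers from the NFSv4 protocol definition.
OP_LOOKUP = 15
OP_PUTFH = 22
OP_READ = 25

# stateid4 is a 4-byte seqid plus 12 fixed-length bytes; all zeros is the
# anonymous stateid (assumption: the server accepts it for READ).
ANONYMOUS_STATEID = b"\x00" * 16


def xdr_opaque(data: bytes) -> bytes:
    """Length prefix, data, zero padding to a multiple of 4 bytes."""
    return struct.pack(">I", len(data)) + data + b"\x00" * (-len(data) % 4)


def compound_lookup_read(dir_handle: bytes, name: str, offset: int, count: int) -> bytes:
    """COMPOUND arguments for PUTFH(dir) + LOOKUP(name) + READ(offset, count)."""
    operations = [
        # Make the directory the current file handle.
        struct.pack(">I", OP_PUTFH) + xdr_opaque(dir_handle),
        # Replace the current file handle with that of the named entry.
        struct.pack(">I", OP_LOOKUP) + xdr_opaque(name.encode("utf-8")),
        # Read from whatever the current file handle now points to.
        struct.pack(">I", OP_READ) + ANONYMOUS_STATEID + struct.pack(">QI", offset, count),
    ]
    tag = xdr_opaque(b"")                 # empty, purely informational tag
    minor_version = struct.pack(">I", 0)  # NFSv4.0
    return tag + minor_version + struct.pack(">I", len(operations)) + b"".join(operations)
```

The READ operation carries no file handle of its own. It operates on the current file handle that the preceding LOOKUP has set, which is how the result of one operation is passed to the next.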
Apart from NFSv4.0, there are two other minor versions, 4.1 and 4.2, which introduce new performance-enhancing features.