3_0 Protocol Details

Michael Eder edited this page Dec 29, 2024 · 1 revision

The NFS protocol is built on Sun RPC and uses XDR as its data serialization format.

XDR

XDR is the data serialization format used by NFS and related protocols. It defines a way to encode basic data types, lists, enums, structs and tagged unions. XDR also provides a C-like language to define data types which can be used to automatically generate classes, serialization and deserialization code for multiple programming languages.
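As a rough illustration of these encoding rules, the following Python sketch encodes a few basic XDR types by hand. The helper names (`xdr_uint32`, `xdr_uint64`, `xdr_opaque`, `xdr_string`) are made up for this example and do not come from any library. In practice such code would usually be generated from an XDR definition.

```python
import struct


def xdr_uint32(value: int) -> bytes:
    """Unsigned 32-bit integer: 4 bytes, big-endian."""
    return struct.pack(">I", value)


def xdr_uint64(value: int) -> bytes:
    """Unsigned 64-bit integer (XDR 'unsigned hyper'): 8 bytes, big-endian."""
    return struct.pack(">Q", value)


def xdr_opaque(data: bytes) -> bytes:
    """Variable-length opaque data: length prefix, payload, zero padding.

    XDR pads every item to a multiple of 4 bytes.
    """
    padding = (4 - len(data) % 4) % 4
    return xdr_uint32(len(data)) + data + b"\x00" * padding


def xdr_string(text: str) -> bytes:
    """Strings use the same wire layout as variable-length opaque data."""
    return xdr_opaque(text.encode("ascii"))


# Example: the 3-byte string "nfs" becomes 8 bytes on the wire.
print(xdr_string("nfs").hex())  # 000000036e667300
```

A tagged union is encoded the same way: first the discriminant as a 32-bit integer, then the fields of the selected arm.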

Sun RPC

NFS and related protocols are based on Sun RPC (= ONC RPC). Originally, UDP was used as the transport protocol. Newer versions also support TCP and other transports, including RDMA for high-performance applications.

The authentication flavours and RPC-with-TLS are part of the RPC layer and can also be used for other RPC-based protocols.

The RPC protocol allows a client to perform operations on a server by calling a procedure, which is identified by a number, and providing arguments. The server then returns a result. For example, if a client wants to read a file in NFSv3, it calls the READ procedure, which has the ID 6, and transmits an XDR-encoded READ3args struct containing the file handle that identifies the file, the position in the file and the byte count. The server responds with a READ3res tagged union, which contains either a READ3resok struct if the operation was successful or a READ3resfail struct if there was an error.
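The READ example can be sketched in Python as follows. This is a minimal illustration, not a working client: it only builds the bytes of the call and never sends them. The file handle is a placeholder, and the call uses empty AUTH_NONE credentials for brevity, whereas a real server will usually expect AUTH_SYS or stronger credentials. The function names are invented for this sketch.

```python
import struct

NFS_PROGRAM = 100003   # program number of the nfs service
NFS_VERSION = 3
NFSPROC3_READ = 6      # procedure ID of READ in NFSv3
RPC_CALL = 0           # message type: call (a reply would be 1)
RPC_VERSION = 2
AUTH_NONE = 0


def xdr_opaque(data: bytes) -> bytes:
    """Length prefix, payload, zero padding to a multiple of 4 bytes."""
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def read3args(file_handle: bytes, offset: int, count: int) -> bytes:
    """READ3args: file handle, 64-bit offset, 32-bit byte count."""
    return xdr_opaque(file_handle) + struct.pack(">QI", offset, count)


def rpc_call(xid: int, program: int, version: int, procedure: int, args: bytes) -> bytes:
    """RPC call header, empty credential and verifier, then the arguments."""
    header = struct.pack(">6I", xid, RPC_CALL, RPC_VERSION, program, version, procedure)
    null_auth = struct.pack(">II", AUTH_NONE, 0)  # flavor, zero-length body
    return header + null_auth + null_auth + args


def tcp_record(message: bytes) -> bytes:
    """On TCP, each record fragment is prefixed with a 4-byte marker.

    The highest bit marks the last fragment, the other 31 bits hold the length.
    """
    return struct.pack(">I", 0x80000000 | len(message)) + message


# Placeholder file handle; real handles are only ever obtained from the server.
placeholder_fh = b"\x01" * 8
message = rpc_call(1, NFS_PROGRAM, NFS_VERSION, NFSPROC3_READ,
                   read3args(placeholder_fh, offset=0, count=4096))
```

The reply would be decoded in the same spirit: after the RPC reply header, the READ3res union starts with a 32-bit status that selects the READ3resok or READ3resfail arm.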

Protocol Versions

The two versions of NFS that are prevalent and supported by most server implementations today are NFSv3 and NFSv4.

NFSv3

NFSv3 requires three protocols to work. Apart from the NFS protocol which is used to perform the actual file operations, there are two other services that a client has to contact before it can start using an NFS export: rpcbind and mount, which will be explained later. There are also some other optional protocols like nlockmgr which is responsible for file locking and nfs_acl which is used to control access to files.

NFS

The NFS protocol is used for the actual file operations. One of the design goals of the NFS protocol is that the server is stateless in order to simplify crash recovery. For this reason, there is no need to open or close files. If a client has a file handle, it can perform read and write operations.

File handles

In NFS, files and directories are identified using file handles, not by their name or path. When a client wants the server to perform an operation on an object, it has to provide a file handle.

On the protocol level, a file handle is represented in XDR as a variable-length byte array. The structure of a file handle and the information stored in it are not standardized; they are determined by the server implementation. Clients should only use file handles that they have received from the server and are not supposed to interpret or modify them. Later we will look at the internal structure of file handles generated by Linux and Windows servers. A file handle should remain valid and point to the same file after a server restart, and even when the file is renamed or moved to a different directory. This imitates the properties of file descriptors that some applications rely on.

Servers implement different mechanisms to prevent clients from tampering with file handles. Linux can verify that file handles provided by the client point to files within the export if the subtree_check option is set. Windows appends a signature to each file handle by default.

In order to obtain a file handle for a specific file or directory, a client has to know the file handle of its parent directory and perform a LOOKUP or READDIRPLUS operation in order to request the file handle it needs. Initially the client does not know any file handle and cannot do anything. This is why the mount protocol is necessary. Before a client can use an NFS export, it has to obtain the export directory's file handle from the mount daemon. The client can then use this file handle to recursively request any other file handle in the export.
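The way a client walks from the export root to an arbitrary file can be sketched as a loop of LOOKUP calls. The sketch below is a toy model: `fake_lookup` and `FAKE_TREE` stand in for a real LOOKUP RPC and a real server, and the handle values are invented.

```python
from typing import Callable

# A LOOKUP takes a directory file handle and a name and returns a file handle.
Lookup = Callable[[bytes, str], bytes]


def resolve_path(root_fh: bytes, path: str, lookup: Lookup) -> bytes:
    """Resolve a path relative to the export root, one LOOKUP per component."""
    fh = root_fh
    for name in path.strip("/").split("/"):
        if name:  # skip empty components such as in "" or "a//b"
            fh = lookup(fh, name)
    return fh


# Hypothetical stand-in for a server: (directory handle, name) -> handle.
FAKE_TREE = {
    (b"fh-root", "docs"): b"fh-docs",
    (b"fh-docs", "readme.txt"): b"fh-readme",
}


def fake_lookup(dir_fh: bytes, name: str) -> bytes:
    return FAKE_TREE[(dir_fh, name)]


print(resolve_path(b"fh-root", "/docs/readme.txt", fake_lookup))  # b'fh-readme'
```

The only handle the client cannot obtain this way is `root_fh` itself, which is why it has to come from the mount daemon.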

rpcbind

Some ports for Sun RPC protocols are dynamically assigned by rpcbind (also called portmap or portmapper) at runtime. A client that wants to use an RPC service on the server has to request the port from rpcbind, which always listens on port 111. rpcbind provides a list of all supported services with their versions and ports, and states whether they use TCP or UDP as the transport protocol.

It is possible to see all protocols supported by a server with the rpcinfo command:

$ rpcinfo -p <HOST>
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  44874  status
    100024    1   tcp  43269  status
    100005    1   udp  47905  mountd
    100005    1   tcp  42015  mountd
    100005    2   udp  50936  mountd
    100005    2   tcp  53331  mountd
    100005    3   udp  43168  mountd
    100005    3   tcp  38723  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100021    1   udp  39309  nlockmgr
    100021    3   udp  39309  nlockmgr
    100021    4   udp  39309  nlockmgr
    100021    1   tcp  42865  nlockmgr
    100021    3   tcp  42865  nlockmgr
    100021    4   tcp  42865  nlockmgr

This output shows that the server supports NFS versions 3 and 4 via TCP on port 2049 as well as multiple versions of the mount protocol on some other ports. NFS typically runs on port 2049, while mount and other protocols have dynamically assigned port numbers. On Windows, all protocols run on port 2049.
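When this listing needs to be processed by a script, it can be parsed into records and searched for a specific service. The following Python sketch assumes the column layout shown above; `RpcService`, `parse_rpcinfo` and `find_port` are names chosen for this example.

```python
from typing import NamedTuple, Optional


class RpcService(NamedTuple):
    program: int
    version: int
    protocol: str
    port: int
    name: str


def parse_rpcinfo(output: str) -> list[RpcService]:
    """Parse the table printed by `rpcinfo -p` into a list of records."""
    services = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header, the command line itself and blank lines.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        # The service name column may be missing for unknown programs.
        name = fields[4] if len(fields) > 4 else ""
        services.append(
            RpcService(int(fields[0]), int(fields[1]), fields[2], int(fields[3]), name)
        )
    return services


def find_port(services: list[RpcService], name: str, version: int,
              protocol: str) -> Optional[int]:
    """Return the port of the first matching service, or None."""
    for service in services:
        if (service.name, service.version, service.protocol) == (name, version, protocol):
            return service.port
    return None
```

With the listing above as input, `find_port(services, "mountd", 3, "tcp")` would return 38723.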

If a server only supports NFSv4, rpcbind might not be running because it is not needed anymore.

mount

The mount protocol is used by clients to request information about available exports on a server and to obtain the file handle of the export root directory, as previously mentioned. It also reports the list of hosts that are allowed to access each export. This information can be viewed with showmount -e, which performs an EXPORT operation on the server and prints the results.

$ showmount -e <HOST>
Export list for 192.168.247.129:
/nfs_test/wildcard       *
/nfs_test/netgroup       @testgroup
/nfs_test/wildcard2      192.168.247.1?8
/nfs_test/dns            *.ipa.test
/nfs_test/subnet         192.168.248.0/255.255.255.0,192.168.247.0/24
/nfs_test/hostname       192.168.247.128
/nfs_test/everyone       (everyone)
/nfs_test/subdirectory   192.168.247.128
/nfs_test                192.168.247.130,192.168.247.128

If a server only supports NFSv4, the mount service might not be running on the server since it is not used by NFSv4. In that case, showmount won't show any exports even though an NFS server is running.

mountd also keeps track of clients that have exports mounted. This information can be viewed using showmount -a, which shows the results of the DUMP operation.

$ showmount -a <HOST>
All mount points on 192.168.247.129:
192.168.247.128:/nfs_test
192.168.247.128:/nfs_test/subdirectory
192.168.247.128:/nfs_test/wildcard
192.168.247.130:/nfs_test
192.168.247.130:/nfs_test/everyone

This information may however be unreliable. If a client crashes or unmounts an export without telling mountd, it will remain in the list. It is also possible that a client tells mountd that it unmounted the export but keeps accessing it. Clients that use NFSv4 don't show up in this list either. The Windows NFS server does not report any clients.

nfs_analyze is able to extract more information from mountd than showmount by pretending to be a client that wants to mount every export. It will send an MNT operation for each export reported by the EXPORT operation. The response to the MNT operation contains the authentication methods allowed for the requesting host and the export root file handle.
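To make the content of an MNT response more concrete, the following Python sketch decodes the result body of a mount version 3 MNT call: a status, then the file handle as variable-length opaque data, then an array of authentication flavor numbers. It is not taken from nfs_analyze. The function name is invented, the RPC reply header is assumed to have been stripped already, and the name table only lists a few well-known flavor numbers.

```python
import struct

MNT3_OK = 0

# A few well-known flavor numbers; servers may report others.
FLAVOR_NAMES = {
    0: "AUTH_NONE",
    1: "AUTH_SYS",
    390003: "krb5",
    390004: "krb5i",
    390005: "krb5p",
}


def decode_mnt3_result(data: bytes):
    """Return (status, file_handle, flavors) from an MNT result body."""
    (status,) = struct.unpack_from(">I", data, 0)
    if status != MNT3_OK:
        return status, None, []  # error replies carry no further data
    offset = 4
    (fh_len,) = struct.unpack_from(">I", data, offset)
    offset += 4
    file_handle = data[offset:offset + fh_len]
    offset += fh_len + (4 - fh_len % 4) % 4  # skip XDR padding
    (flavor_count,) = struct.unpack_from(">I", data, offset)
    offset += 4
    flavors = list(struct.unpack_from(f">{flavor_count}I", data, offset))
    return status, file_handle, flavors


def flavor_name(flavor: int) -> str:
    return FLAVOR_NAMES.get(flavor, f"unknown ({flavor})")
```

The returned file handle is the starting point for all further LOOKUP calls on that export.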

acl

The NFSv3 protocol itself does not support ACLs. There is an unofficial protocol, developed by Oracle and now also supported by Linux, which allows clients to retrieve and modify ACLs on the server.

NFSv4

In NFSv4, there is only one service listening on port 2049, which makes rpcbind unnecessary. This also simplifies firewall configurations compared to the dynamic ports used by NFSv3 and its related protocols. In contrast to NFSv3, where clients receive a list of exported directories from mountd, in NFSv4 all exports are presented to the client as subdirectories of a virtual root directory. Clients can use the PUTROOTFH operation to get the file handle of this root directory, which makes the mount protocol unnecessary. In NFSv3, the mount protocol was also used by the client to find out which authentication methods are supported for which export. Since the mount protocol is not used in NFSv4, clients can either use the SECINFO operation to ask the server which authentication methods are supported for a specific file, or they can just try to perform an operation on the file, in which case the server returns NFS4ERR_WRONGSEC if another authentication method is needed.

ACLs are now part of the NFS protocol itself which makes the separate acl protocol unnecessary.

Another change compared to NFSv3 is that it is now possible to perform multiple operations in one RPC request, which can reduce the number of round trips. On the protocol level, there are only two procedures: NULL and COMPOUND. The COMPOUND procedure accepts an array of arguments for operations and returns an array of results. For example, if a client knows the file handle of a directory and wants to read a file in that directory but only knows its name, not its file handle, it can send a LOOKUP and a READ operation in one request. The LOOKUP makes the file's handle the current file handle of the request, and the following READ operation uses it automatically.
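How a file handle flows from one operation to the next can be illustrated with a toy model in Python. This is not an NFSv4 implementation: `SERVER_OBJECTS` is an invented in-memory stand-in for a server, operations are plain tuples instead of XDR, and READ omits the state ID that the real operation carries. It only models the current file handle and the rule that processing stops at the first failing operation.

```python
# Hypothetical server state: directories map names to handles, files hold bytes.
SERVER_OBJECTS = {
    b"fh-dir": {"notes.txt": b"fh-notes"},
    b"fh-notes": b"hello from NFSv4",
}


def run_compound(ops):
    """Evaluate operations in order, sharing one current file handle."""
    current_fh = None
    results = []
    for op, *args in ops:
        if op == "PUTFH":  # set the current file handle explicitly
            current_fh = args[0]
            results.append((op, "NFS4_OK"))
            continue
        if current_fh is None:
            results.append((op, "NFS4ERR_NOFILEHANDLE"))
            break
        node = SERVER_OBJECTS.get(current_fh)
        if node is None:  # simplified: unknown handles are treated as stale
            results.append((op, "NFS4ERR_STALE"))
            break
        if op == "LOOKUP":
            if not isinstance(node, dict):
                results.append((op, "NFS4ERR_NOTDIR"))
                break
            if args[0] not in node:
                results.append((op, "NFS4ERR_NOENT"))
                break
            current_fh = node[args[0]]  # the result becomes the current handle
            results.append((op, "NFS4_OK"))
        elif op == "READ":
            if not isinstance(node, bytes):
                results.append((op, "NFS4ERR_ISDIR"))
                break
            offset, count = args
            results.append((op, "NFS4_OK", node[offset:offset + count]))
        else:
            results.append((op, "NFS4ERR_OP_ILLEGAL"))
            break
    return results


print(run_compound([("PUTFH", b"fh-dir"), ("LOOKUP", "notes.txt"), ("READ", 0, 5)]))
```

Note that the request starts with PUTFH: the client first has to load the directory's file handle as the current file handle before LOOKUP can operate on it.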

Apart from NFSv4.0, there are two further minor versions, 4.1 and 4.2, which introduce new performance-enhancing features.