Control ANF availability zone vs VM scale set #1851

Open
ltalirz opened this issue Feb 21, 2024 · 1 comment
Labels
kind/feature New feature request

Comments

@ltalirz
Contributor

ltalirz commented Feb 21, 2024

In what area(s)?

/area administration
/area ansible
/area autoscaling
/area configuration
/area cyclecloud
/area documentation
/area image
/area job-scheduling
/area monitoring
/area ood
/area remote-visualization
/area user-management

Describe the feature

Depending on whether the ANF volume is allocated in the same availability zone as the VMs or in a different one, we have seen very significant differences in latency

(e.g. for a 4 TB ANF Premium volume with Basic networking, 4 KB random-read performance dropped from 1650 IOPS to ~800 IOPS when the ANF volume was in a different availability zone than the D4ads v5 VM, in West Europe).

It would be great if one could enforce that the ANF volume and the VM scale sets are placed in the same availability zone.
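For reference, a minimal sketch of what pinning both resources to the same zone could look like with the plain Azure CLI (outside of the configuration interface this request is about). The resource names are placeholders, and the exact flags, in particular --zones on az netappfiles volume create, are my assumption and may vary with CLI version:

# Assumption: pin both the scale set and the ANF volume to zone 1 in West Europe (placeholder names)
az vmss create --resource-group my-rg --name my-vmss --image Ubuntu2204 \
    --vm-sku Standard_D4ads_v5 --location westeurope --zones 1

az netappfiles volume create --resource-group my-rg --account-name my-anf-account \
    --pool-name my-pool --name my-volume --location westeurope \
    --service-level Premium --usage-threshold 4096 --file-path myvolume \
    --vnet my-vnet --subnet anf-subnet --zones 1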


@ltalirz ltalirz added the kind/feature New feature request label Feb 21, 2024
@ltalirz
Contributor Author

ltalirz commented Feb 21, 2024

P.S. We've also seen that latency depends on whether the ANF networking feature is set to Basic or Standard, with Standard resulting in ~60% higher latency when everything else is the same, including the availability zone (same test setup as above).

This should be validated by others to see whether this number is roughly reproducible.
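(For anyone reproducing this: the networking feature of an existing volume can be inspected with the Azure CLI, and selected at creation time via --network-features Basic|Standard; the flag and property names below reflect my understanding of the current az netappfiles interface and may differ by CLI version.)

# Assumption: placeholder names; show the networking feature of an existing ANF volume
az netappfiles volume show --resource-group my-rg --account-name my-anf-account \
    --pool-name my-pool --name my-volume --query networkFeatures --output tsv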

Test results

These actually show a ~100% increase in latency for Standard vs. Basic (IOPS dropping from 1925 to 943), so the exact factor does not seem to be highly reproducible.

Basic

[root@viz-3 anftest]# fio --name=random_read_test --ioengine=libaio --iodepth=1 --rw=randread --bs=4k --direct=1 --size=1G --numjobs=1 --runtime=15
random_read_test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.19
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=7924KiB/s][r=1981 IOPS][eta 00m:00s]
random_read_test: (groupid=0, jobs=1): err= 0: pid=68431: Wed Feb 21 20:27:45 2024
  read: IOPS=1923, BW=7692KiB/s (7877kB/s)(113MiB/15001msec)
    slat (nsec): min=1713, max=28633, avg=5589.18, stdev=1409.75
    clat (usec): min=234, max=6325, avg=513.46, stdev=191.25
     lat (usec): min=241, max=6331, avg=519.15, stdev=191.28
    clat percentiles (usec):
     |  1.00th=[  375],  5.00th=[  420], 10.00th=[  453], 20.00th=[  469],
     | 30.00th=[  478], 40.00th=[  482], 50.00th=[  490], 60.00th=[  498],
     | 70.00th=[  506], 80.00th=[  519], 90.00th=[  562], 95.00th=[  668],
     | 99.00th=[  906], 99.50th=[ 1188], 99.90th=[ 3785], 99.95th=[ 4555],
     | 99.99th=[ 5538]
   bw (  KiB/s): min= 5728, max= 8104, per=100.00%, avg=7703.38, stdev=577.84, samples=29
   iops        : min= 1432, max= 2026, avg=1925.83, stdev=144.48, samples=29
  lat (usec)   : 250=0.01%, 500=65.25%, 750=31.99%, 1000=2.12%
  lat (msec)   : 2=0.35%, 4=0.21%, 10=0.08%
  cpu          : usr=0.49%, sys=2.44%, ctx=28851, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=28848,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=7692KiB/s (7877kB/s), 7692KiB/s-7692KiB/s (7877kB/s-7877kB/s), io=113MiB (118MB), run=15001-15001msec

Standard

[root@viz-3 anfhome]# fio --name=random_read_test --ioengine=libaio --iodepth=1 --rw=randread --bs=4k --direct=1 --size=1G --numjobs=1 --runtime=15
random_read_test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.19
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=3552KiB/s][r=888 IOPS][eta 00m:00s]
random_read_test: (groupid=0, jobs=1): err= 0: pid=68514: Wed Feb 21 20:28:28 2024
  read: IOPS=937, BW=3749KiB/s (3839kB/s)(54.9MiB/15001msec)
    slat (nsec): min=1893, max=28644, avg=5840.79, stdev=1688.80
    clat (usec): min=876, max=23286, avg=1059.97, stdev=271.62
     lat (usec): min=880, max=23295, avg=1065.92, stdev=271.69
    clat percentiles (usec):
     |  1.00th=[  979],  5.00th=[  996], 10.00th=[ 1004], 20.00th=[ 1012],
     | 30.00th=[ 1020], 40.00th=[ 1020], 50.00th=[ 1029], 60.00th=[ 1037],
     | 70.00th=[ 1045], 80.00th=[ 1057], 90.00th=[ 1172], 95.00th=[ 1221],
     | 99.00th=[ 1450], 99.50th=[ 1631], 99.90th=[ 2737], 99.95th=[ 4228],
     | 99.99th=[13698]
   bw (  KiB/s): min= 3363, max= 3896, per=100.00%, avg=3774.17, stdev=115.42, samples=29
   iops        : min=  840, max=  974, avg=943.52, stdev=28.95, samples=29
  lat (usec)   : 1000=9.92%
  lat (msec)   : 2=89.85%, 4=0.18%, 10=0.04%, 20=0.01%, 50=0.01%
  cpu          : usr=0.20%, sys=1.31%, ctx=14062, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=14059,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=3749KiB/s (3839kB/s), 3749KiB/s-3749KiB/s (3839kB/s-3839kB/s), io=54.9MiB (57.6MB), run=15001-15001msec
