
evaluate NUMA memory binding #110

Open
psychocrypt opened this issue May 14, 2017 · 6 comments

@psychocrypt
Collaborator

Based on the discussion in #108 it could be useful to evaluate whether there is any improvement if we bind memory to the correct NUMA node.

@fireice-uk's observation

One of the points I noted when testing on multi-cpu, multi-ram systems (on Linux) is that the results varied by around 5% but were constant during the run, and usually during a run shortly after, whereas they were most likely to change after reboot.

@psychocrypt changed the title from "evaluate NUMA memory binf" to "evaluate NUMA memory binding" on May 14, 2017
@psychocrypt
Collaborator Author

@fireice-uk Could you please post the output of hwloc-ls --output-format ascii from your laptop?

@fireice-uk
Owner

┌─────────────────────────────────────────────────────────────────────────────┐
│ Machine (7690MB)                                                            │
│                                                                             │
│ ┌────────────────────────────────┐            ┌───────────────────────────┐ │
│ │ Package P#0                    │  ├┤╶─┬─────┤ PCI 8086:0a16             │ │
│ │                                │      │     │                           │ │
│ │ ┌────────────────────────────┐ │      │     │ ┌───────┐  ┌────────────┐ │ │
│ │ │ L3 (3072KB)                │ │      │     │ │ card0 │  │ renderD128 │ │ │
│ │ └────────────────────────────┘ │      │     │ └───────┘  └────────────┘ │ │
│ │                                │      │     │                           │ │
│ │ ┌────────────┐  ┌────────────┐ │      │     │ ┌────────────┐            │ │
│ │ │ L2 (256KB) │  │ L2 (256KB) │ │      │     │ │ controlD64 │            │ │
│ │ └────────────┘  └────────────┘ │      │     │ └────────────┘            │ │
│ │                                │      │     └───────────────────────────┘ │
│ │ ┌────────────┐  ┌────────────┐ │      │                                   │
│ │ │ L1d (32KB) │  │ L1d (32KB) │ │      │     ┌───────────────┐             │
│ │ └────────────┘  └────────────┘ │      ├─────┤ PCI 8086:1559 │             │
│ │                                │      │     │               │             │
│ │ ┌────────────┐  ┌────────────┐ │      │     │ ┌─────────┐   │             │
│ │ │ L1i (32KB) │  │ L1i (32KB) │ │      │     │ │ enp0s25 │   │             │
│ │ └────────────┘  └────────────┘ │      │     │ └─────────┘   │             │
│ │                                │      │     └───────────────┘             │
│ │ ┌────────────┐  ┌────────────┐ │      │                                   │
│ │ │ Core P#0   │  │ Core P#1   │ │      │               ┌───────────────┐   │
│ │ │            │  │            │ │      ├─────┼┤╶───────┤ PCI 8086:08b2 │   │
│ │ │ ┌────────┐ │  │ ┌────────┐ │ │      │               │               │   │
│ │ │ │ PU P#0 │ │  │ │ PU P#2 │ │ │      │               │ ┌────────┐    │   │
│ │ │ └────────┘ │  │ └────────┘ │ │      │               │ │ wlp3s0 │    │   │
│ │ │ ┌────────┐ │  │ ┌────────┐ │ │      │               │ └────────┘    │   │
│ │ │ │ PU P#1 │ │  │ │ PU P#3 │ │ │      │               └───────────────┘   │
│ │ │ └────────┘ │  │ └────────┘ │ │      │                                   │
│ │ └────────────┘  └────────────┘ │      │     ┌───────────────┐             │
│ └────────────────────────────────┘      └─────┤ PCI 8086:9c03 │             │
│                                               │               │             │
│                                               │ ┌─────┐       │             │
│                                               │ │ sda │       │             │
│                                               │ └─────┘       │             │
│                                               └───────────────┘             │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ Host: LAP440                                                                │
│                                                                             │
│ Indexes: physical                                                           │
│                                                                             │
│ Date: Sun 14 May 2017 21:35:42 BST                                          │
└─────────────────────────────────────────────────────────────────────────────┘

Not sure if the ASCII art will get garbled.

@psychocrypt
Collaborator Author

Could you please try to use the newest version of xmr-stak with hwloc? It looks like cores 0 and 2 are the physical cores in your system (not 0,1). The new code, with the function to overload a system if 1 MiB per hash is available, should also suggest using cores 0 and 2. This should remove the fluctuations in the hashrate.

I also always used cores 0,1 until hwloc showed that 0,2 are the physical cores.

@fireice-uk
Owner

fireice-uk commented May 21, 2017

I will give it a spin, and yes, it seems you are correct on that point; it seems we will need to properly allocate core affinity:

cat /proc/cpuinfo | grep 'processor\|core id\|physical id'
processor	: 0
physical id	: 0
core id		: 0
processor	: 1
physical id	: 0
core id		: 0
processor	: 2
physical id	: 0
core id		: 1
processor	: 3
physical id	: 0
core id		: 1

@psychocrypt psychocrypt self-assigned this May 31, 2017
@psychocrypt
Collaborator Author

@fireice-uk Status update: I am currently working on this issue. I have finished a prototype where the memory for a thread is pinned to its NUMA node.
At the moment I can't see any benefit under Linux. I will clean up my code and create a pull request soon.

@fireice-uk
Owner

OK, great. What's the setup you are testing on?
