Channel: Mellanox Interconnect Community: Message List

Re: OpenMPI with MXM 32bit issue


Well, that's interesting.

The case on two hosts works fine:

$ /opt/openmpi-2.0.1-jessie-mxm-mt/bin/mpirun -np 2 -hostfile hostfile --map-by node --display-map -mca pml yalla openmpi_mxm_freeze

Data for JOB [31717,1] offset 0

========================  JOB MAP  ========================

Data for node: intel1  Num slots: 1    Max slots: 0    Num procs: 1

        Process OMPI jobid: [31717,1] App: 0 Process rank: 0 Bound: socket 0[core 0[hwt 0-1]]:[BB/../../../../../../../../../../..][../../../../../../../../../../../..]

Data for node: intel2  Num slots: 1    Max slots: 0    Num procs: 1

        Process OMPI jobid: [31717,1] App: 0 Process rank: 1 Bound: socket 0[core 0[hwt 0-1]]:[BB/../../../../../../../../../../..][../../../../../../../../../../../..]

=============================================================

[1474616276.871628] [intel1:7883 :0]        sys.c:744  MXM  WARN  Conflicting CPU frequencies detected, using: 2906.98

[1474616276.903256] [intel2:3181 :0]        sys.c:744  MXM  WARN  Conflicting CPU frequencies detected, using: 3043.73

0: ready to run

1: ready to run

...

0: finished

1: finished

 

while the single-host case does not:

$ /opt/openmpi-2.0.1-jessie-mxm-mt/bin/mpirun -np 2 --map-by node --display-map -mca pml yalla openmpi_mxm_freeze

Data for JOB [31494,1] offset 0

========================  JOB MAP  ========================

Data for node: intel1  Num slots: 24  Max slots: 0    Num procs: 2

        Process OMPI jobid: [31494,1] App: 0 Process rank: 0 Bound: socket 0[core 0[hwt 0-1]]:[BB/../../../../../../../../../../..][../../../../../../../../../../../..]

        Process OMPI jobid: [31494,1] App: 0 Process rank: 1 Bound: socket 0[core 1[hwt 0-1]]:[../BB/../../../../../../../../../..][../../../../../../../../../../../..]

=============================================================

[1474615276.877829] [intel1:7723 :0]        sys.c:744  MXM  WARN  Conflicting CPU frequencies detected, using: 2971.04

[1474615276.877833] [intel1:7724 :0]        sys.c:744  MXM  WARN  Conflicting CPU frequencies detected, using: 2971.04

0: ready to run

1: ready to run

...

freeze

 

Since we normally run on a single host and only in extreme cases on two or more, a solution for the single-host case would be appreciated.

