### Data collection

The study was conducted in April–June 2020 in 15 wards in university hospital centres in Paris, Lyon and Bordeaux, selected for their diversity of clinical activity (details in Supplementary Table S1). Each ward was studied for approximately 36 h, starting with the nurses’ day shift on the morning of day 1 and finishing at the end of the day shift on day 2. All individuals initially present in the ward were offered sensors, as were all subsequent arrivals to the ward. The sensor was returned at the end of the study period or on the participant’s departure. The age and function (patient, visitor, or type of health professional) of each individual were recorded, as well as the time period during which the sensor was carried. The wearable sensors (shown in Supplementary Figure S1) transmitted a signal every 10 s (a “Hello” packet), and recorded the identity of other transmitting sensors within a range of about 1.5 m. The reception of a single Hello packet, whether recorded by both sensors or just one, was assumed to represent 10 s of contact between the two sensors. Participants kept the sensor either in a pocket or on a pendant around the neck. For patients assigned to their room (COVID-19 patients, intensive care patients or neonates), the sensor was hung on a fixed part of the bed.
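The rule that one received Hello packet counts as 10 s of contact, whether one or both sensors recorded it, can be illustrated with a minimal Python sketch. The log layout `(time_slot, receiver, sender)`, the function name `contact_seconds`, and the assumption that receptions are already aligned to common 10-s slots are all hypothetical and are not taken from the study's data format.

```python
from collections import defaultdict

PACKET_SECONDS = 10  # each received Hello packet is taken as 10 s of contact


def contact_seconds(receptions):
    """Total contact time (s) per unordered pair of sensors.

    `receptions` is an iterable of (time_slot, receiver, sender) tuples,
    a hypothetical log layout with slots already aligned to 10-s intervals.
    """
    slots = defaultdict(set)
    for slot, receiver, sender in receptions:
        pair = tuple(sorted((receiver, sender)))
        # A set keeps each slot once, so a packet heard by both sensors
        # in the same slot is not counted twice.
        slots[pair].add(slot)
    return {pair: len(s) * PACKET_SECONDS for pair, s in slots.items()}


log = [
    (0, "A", "B"), (0, "B", "A"),  # same slot, recorded by both sensors
    (1, "A", "B"),                 # recorded by one sensor only
    (5, "C", "A"),
]
durations = contact_seconds(log)
print(durations)  # {('A', 'B'): 20, ('A', 'C'): 10}
```

Storing pairs in sorted order makes the contact symmetric, so the later per-individual summaries can read the same duration from either side.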

### Contact analysis

The first step in the data analysis was to calculate summary statistics of contact, first for each individual and then at the ward level between hospital users of different status (patient, visitor or healthcare worker, HCW). The contact matrices summarise the amount of contact between each status of individual for each ward. The contact intensity and contact rate per hour, and the average duration of each contact, were calculated for individuals of status *x* with those of status *y*.

Contact intensity was calculated for each individual as the total recorded cumulative contact minutes divided by the number of hours that individual spent carrying the sensor. The contact intensity \({k}_{xy}\) is the total cumulative time an individual of status *x* spent in contact with individuals of status *y* per hour on the ward, and is calculated as in Eq. (1) where \({n}_{x}\) is the number of individuals of status *x* on the ward, *i* is an individual of status *x*, \({t}_{i}\) is the number of hours this individual spent carrying the sensor, *j* is an individual of status *y*, \({C}_{iy}\) is the number of unique individuals of status *y* contacted by *i*, and \({d}_{ij}\) is the total duration of their contact over the study period.

$${k}_{xy}=\frac{1}{{n}_{x}}\sum_{i}^{{n}_{x}}\frac{1}{{t}_{i}}\sum_{j}^{{C}_{iy}}{d}_{ij}$$

(1)

Similarly, individual contact rate was the number of unique persons contacted by that individual, per hour carrying the sensor. The average contact rate per hour \({c}_{xy}\) for individuals of status *x* with those of status *y* is calculated by Eq. (2) as the number of unique contacts of status *y* for individual *i* divided by their time with the sensor \({t}_{i}\), averaged over all individuals *i* of status *x*.

$${c}_{xy}=\frac{1}{{n}_{x}}\sum_{i}^{{n}_{x}}\frac{{C}_{iy}}{{t}_{i}}$$

(2)

Individual average contact duration was the total cumulative contact minutes divided by the number of persons contacted. The average duration of a contact that status *x* has with status *y*, \({d}_{xy}\), is calculated as in Eq. (3) by first taking the total duration of all contacts that an individual *i* of status *x* has with individuals *j* of status *y* and dividing it by the number of individuals of that status contacted, \({C}_{iy}\). The average of this value is then taken across all individuals *i* of status *x*.

$${d}_{xy}=\frac{1}{{n}_{x}}\sum_{i}^{{n}_{x}}\frac{1}{{C}_{iy}}\sum_{j}^{{C}_{iy}}{d}_{ij}$$

(3)

The mean of each of these measures (contact rate, contact intensity and contact duration) was then calculated for each ward and for each combination of statuses, and reported in contact matrices.
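As an illustration of Eqs. (1)–(3), the following Python sketch computes the three measures for every ordered pair of statuses on a toy ward. It is not the authors' R code: the data, the names `status`, `hours`, `minutes`, `durations_with` and `contact_measures` are hypothetical, and treating an individual with no contacts of status *y* as contributing zero to Eq. (3) is an assumption, since the text does not state how that case is handled.

```python
from itertools import product

# Hypothetical toy ward: status, hours carrying the sensor (t_i), and
# cumulative contact minutes per pair (d_ij), each pair listed once.
status = {"p1": "patient", "p2": "patient", "n1": "HCW", "n2": "HCW"}
hours = {"p1": 10.0, "p2": 20.0, "n1": 8.0, "n2": 8.0}
minutes = {("p1", "n1"): 30.0, ("p1", "n2"): 10.0, ("p2", "n1"): 10.0}


def durations_with(i, y):
    """Durations d_ij of i's contacts with each individual j of status y."""
    out = []
    for (a, b), d in minutes.items():
        if i in (a, b):
            j = b if a == i else a
            if status[j] == y:
                out.append(d)
    return out


def contact_measures(x, y):
    """Return (k_xy, c_xy, d_xy) following Eqs. (1)-(3)."""
    members = [i for i, s in status.items() if s == x]
    k = c = d = 0.0
    for i in members:
        d_ij = durations_with(i, y)
        k += sum(d_ij) / hours[i]       # Eq. (1): contact minutes per hour
        c += len(d_ij) / hours[i]       # Eq. (2): unique contacts per hour
        if d_ij:                        # assumption: no contacts adds 0
            d += sum(d_ij) / len(d_ij)  # Eq. (3): minutes per contact
    n_x = len(members)
    return k / n_x, c / n_x, d / n_x


statuses = sorted(set(status.values()))
matrices = {(x, y): contact_measures(x, y)
            for x, y in product(statuses, repeat=2)}
for pair, (k, c, d) in matrices.items():
    print(pair, round(k, 3), round(c, 3), round(d, 3))
```

In this toy ward the patient-to-HCW intensity is 2.25 min/h and the HCW-to-patient intensity is 3.125 min/h. The matrices are therefore not symmetric, because each row is normalised by the sensor time of the individuals of status *x*.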

### Epidemic risk

To examine how these ward-level values translate to epidemic risk, we wrote a transmission model to predict the number of secondary infections that would occur per day from a hypothetical SARS-CoV-2 index case if all contacts were susceptible. For each ward, we calculated the total number of expected contacts per day from the average contact rate per hour, \(\overline{c }\) (Eq. 4), in which *n* is the total number of individuals present and \({C}_{i}\) is the total number of contacts for individual *i*.

$$\overline{c }=\frac{1}{n}\cdot \sum_{i}^{n}\frac{{C}_{i}}{{t}_{i}}$$

(4)

We also calculated the average time spent on the ward per 24-h period, \(\overline{H }\) (Eq. 5), using each individual’s time carrying the sensor, \({t}_{i}\), as a proxy, where *T* is the total duration of the study on that ward.

$$\overline{H }=\frac{24}{n\cdot T}\sum_{i}^{n}{t}_{i}$$

(5)

We assumed that the probability of infection per contact increased with the duration of contact, with a diminishing increase for longer contacts^{22}. The overall probability of infection per contact, \(\overline{{p }_{\text{inf}}}\) (Eq. 6), was calculated from the mean probability of infection per contact for each individual *i* across all of their contacts *j*. The probability of infection between two individuals (Eq. 7) is determined by the duration of contact \({d}_{ij}\), expressed in hours, and a shape parameter *a*, for which higher values are associated with a steeper increase of infection probability as contact duration increases (Supplementary Figure S5). For the baseline analysis, a value of *a* = 0.1 was used, which gives a 50% probability of infection after approximately 11 h in contact (\({d}_{ij}=\ln (3)/a\)).

$$\overline{{p }_{\text{inf}}}=\frac{1}{n}\cdot \sum_{i}^{n}\frac{1}{{C}_{i}}\cdot \sum_{j}^{{C}_{i}}{{p}_{\text{inf}}}_{ij}$$

(6)

$${{p}_{\text{inf}}}_{ij}=\frac{1-{e}^{-{d}_{ij}\cdot a}}{1+{e}^{-{d}_{ij}\cdot a}}$$

(7)
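The correspondence between *a* = 0.1 and a 50% probability after about 11 h can be checked numerically. The Python sketch below assumes \({d}_{ij}\) is in hours; the function names are hypothetical. Setting Eq. (7) equal to 0.5 gives \({e}^{-{d}_{ij}a}=1/3\), so the duration is \(\ln (3)/a\). Eq. (7) is also algebraically equal to \(\tanh (a\,{d}_{ij}/2)\).

```python
import math


def p_inf(d_hours, a=0.1):
    """Eq. (7): infection probability for a contact lasting d_hours."""
    x = math.exp(-a * d_hours)
    return (1.0 - x) / (1.0 + x)


def half_probability_duration(a=0.1):
    """Duration at which Eq. (7) equals 0.5: e^{-a d} = 1/3, so d = ln(3) / a."""
    return math.log(3.0) / a


print(round(half_probability_duration(0.1), 2))  # 10.99
# Over the sensitivity range of a, the 50% duration shrinks as a grows.
for a in (0.05, 0.1, 0.5):
    print(a, round(half_probability_duration(a), 2))
```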

The expected number of secondary infections per day, *M*, was then computed as the product of these three quantities (Eq. 8):

$$M=\overline{c }\cdot \overline{H }\cdot \overline{{p }_{\text{inf}}}$$

(8)
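A minimal Python sketch of Eqs. (4)–(8) on a toy ward is given below for illustration; it is not the authors' R implementation. The data and all names (`hours`, `contact_hours`, `contact_durations`) are hypothetical, \({d}_{ij}\) is assumed to be in hours, and individuals without any contact are assumed to contribute zero to Eq. (6).

```python
import math

T = 36.0                                                # study duration (h)
hours = {"p1": 36.0, "n1": 12.0, "n2": 12.0}            # t_i
contact_hours = {("p1", "n1"): 2.0, ("p1", "n2"): 0.5}  # d_ij, each pair once
a = 0.1                                                 # baseline shape parameter


def p_inf(d):
    """Eq. (7), with d in hours."""
    x = math.exp(-a * d)
    return (1.0 - x) / (1.0 + x)


def contact_durations(i):
    """Durations of all contacts involving individual i."""
    return [d for pair, d in contact_hours.items() if i in pair]


n = len(hours)
c_bar = sum(len(contact_durations(i)) / hours[i] for i in hours) / n  # Eq. (4)
H_bar = 24.0 / (n * T) * sum(hours.values())                          # Eq. (5)
p_bar = sum(                                                          # Eq. (6)
    sum(p_inf(d) for d in contact_durations(i)) / len(contact_durations(i))
    for i in hours
    if contact_durations(i)  # assumption: individuals without contacts add 0
) / n
M = c_bar * H_bar * p_bar                                             # Eq. (8)
print(f"c_bar={c_bar:.4f}/h  H_bar={H_bar:.2f} h/day  p_bar={p_bar:.4f}  M={M:.4f}")
```

The units combine as contacts per hour, times hours on the ward per day, times infections per contact, which gives infections per day.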

Specific predictions of the number of secondary infections per day between different statuses of hospital user (patients, visitors and HCWs) were calculated using the same approach. The number of secondary infections from an index infection of status *x* towards individuals of status *y* is predicted as \({M}_{xy}\) (Eq. 9).

$${M}_{xy}={c}_{xy}\cdot {H}_{x}\cdot {p}_{{\text{inf}}_{xy}}$$

(9)

where \({c}_{xy}\) is the contact rate per hour between *x* and *y* (Eq. 2), \({p}_{{\text{inf}}_{xy}}\) is the probability of infection in contacts between *x* and *y* (Eq. 10, using \({{p}_{\text{inf}}}_{ij}\) from Eq. 7), and \({H}_{x}\) is the average time spent on the ward by individuals of status *x* (Eq. 11).

$${p}_{{\text{inf}}_{xy}}=\frac{1}{{n}_{x}}\cdot \sum_{i}^{{n}_{x}}\frac{1}{{C}_{iy}}\cdot \sum_{j}^{{C}_{iy}}{{p}_{\text{inf}}}_{ij}$$

(10)

$${H}_{x}=\frac{24}{{n}_{x}\cdot T}\sum_{i}^{{n}_{x}}{t}_{i}$$

(11)

Finally, the overall number of secondary infections from an index case of status *x* to individuals of any status is calculated by summing \({M}_{xy}\) over all statuses *y* (Eq. 12).

$${M}_{x}=\sum_{y}{M}_{xy}$$

(12)
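The status-specific calculation of Eqs. (9)–(12) can be sketched in Python as follows. This is illustrative only: the toy data and the names `durations_with` and `secondary_infections` are hypothetical, \({d}_{ij}\) is assumed to be in hours, and an individual with no contacts of status *y* is assumed to contribute zero to Eq. (10).

```python
import math

T = 36.0  # study duration on the ward (h)
a = 0.1   # baseline shape parameter
status = {"p1": "patient", "n1": "HCW", "n2": "HCW", "v1": "visitor"}
hours = {"p1": 36.0, "n1": 12.0, "n2": 12.0, "v1": 2.0}          # t_i
contact_hours = {("p1", "n1"): 2.0, ("p1", "n2"): 0.5,
                 ("p1", "v1"): 1.5, ("n1", "n2"): 1.0}           # d_ij


def p_inf(d):
    """Eq. (7), with d in hours."""
    x = math.exp(-a * d)
    return (1.0 - x) / (1.0 + x)


def durations_with(i, y):
    """Durations of i's contacts with individuals of status y."""
    out = []
    for (u, v), d in contact_hours.items():
        if i in (u, v):
            j = v if u == i else u
            if status[j] == y:
                out.append(d)
    return out


def secondary_infections(x, y):
    """M_xy = c_xy * H_x * p_inf_xy (Eq. 9)."""
    members = [i for i, s in status.items() if s == x]
    n_x = len(members)
    c = p = 0.0
    for i in members:
        d_ij = durations_with(i, y)
        c += len(d_ij) / hours[i]                         # Eq. (2)
        if d_ij:                                          # assumption: else 0
            p += sum(p_inf(d) for d in d_ij) / len(d_ij)  # Eq. (10)
    H_x = 24.0 / (n_x * T) * sum(hours[i] for i in members)  # Eq. (11)
    return (c / n_x) * H_x * (p / n_x)


statuses = sorted(set(status.values()))
M_xy = {(x, y): secondary_infections(x, y) for x in statuses for y in statuses}
M_x = {x: sum(M_xy[(x, y)] for y in statuses) for x in statuses}  # Eq. (12)
print(M_x)
```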

### Simulated interventions

We used this model to predict the effect of control measures targeting the most connected individuals by repeating the calculation of epidemic risk, *M*, with the highest-risk individuals being neither susceptible nor capable of transmitting. We selected the 5% of the population with either the most unique contacts over the whole study period or the highest cumulative contact hours. The probability of infection from or to these individuals was set to zero. We also evaluated the targeting of individuals of a single status only, e.g. highly connected patients, ensuring for comparability that the number targeted still made up 5% of the total population. The reduction in daily risk was calculated as a proportion of the baseline risk in which nobody was targeted (Eq. 13).

$$\text{Relative reduction}=\frac{{M}_{\text{baseline}}-{M}_{\text{targeted}}}{{M}_{\text{baseline}}}$$

(13)
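A minimal Python sketch of the targeting calculation is shown below. It is one possible reading of the procedure, not the authors' code. It assumes that targeted individuals stay in the population counts, so that \(\overline{c }\) and \(\overline{H }\) are unchanged and cancel in Eq. (13), leaving only the ratio of mean infection probabilities. The rounding rule for "5%" and the names `mean_p_inf` and `top_fraction` are likewise hypothetical.

```python
import math


def p_inf(d_hours, a=0.1):
    """Eq. (7), with the contact duration in hours."""
    x = math.exp(-a * d_hours)
    return (1.0 - x) / (1.0 + x)


def mean_p_inf(contact_hours, people, targeted=frozenset(), a=0.1):
    """Eq. (6), with p_inf set to zero for contacts involving a targeted person."""
    total = 0.0
    for i in people:
        probs = [
            0.0 if (pair[0] in targeted or pair[1] in targeted) else p_inf(d, a)
            for pair, d in contact_hours.items() if i in pair
        ]
        if probs:
            total += sum(probs) / len(probs)
    return total / len(people)


def top_fraction(contact_hours, people, fraction=0.05, by="degree"):
    """Most connected individuals by unique contacts or cumulative hours."""
    score = dict.fromkeys(people, 0.0)
    for (i, j), d in contact_hours.items():
        inc = 1.0 if by == "degree" else d
        score[i] += inc
        score[j] += inc
    k = max(1, round(fraction * len(people)))  # rounding rule is an assumption
    ranked = sorted(people, key=lambda i: score[i], reverse=True)
    return frozenset(ranked[:k])


# Hypothetical ward of 20 people: one hub plus a chain of short contacts.
people = [f"id{n:02d}" for n in range(20)]
contact_hours = {("id00", p): 1.0 for p in people[1:]}
contact_hours.update({(people[n], people[n + 1]): 0.5 for n in range(1, 19)})

targeted = top_fraction(contact_hours, people, 0.05, by="degree")
baseline = mean_p_inf(contact_hours, people)
after = mean_p_inf(contact_hours, people, targeted)
relative_reduction = (baseline - after) / baseline  # Eq. (13)
print(sorted(targeted), round(relative_reduction, 3))
```

Passing `by="hours"` ranks individuals by cumulative contact hours instead, which corresponds to the second selection criterion described above.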

We tested the sensitivity of the simulation analysis to the proportion of the population targeted (over the range 0–20%) and to the shape parameter *a*, which drives the increase in the infection probability for longer contacts (over the range 0.05–0.5).

All analyses were conducted using *R 4.2.0*^{23}, with network analyses conducted using *igraph*, and graphics produced using *ggplot2*. The code used for each analysis and visualisation is available at https://github.com/georgeshirreff/nodscov2_risksim.

### Ethics approval and consent to participate

This research was approved by the Comités de protection des personnes (CPP) Ile-de-France VI on 14/04/2020 and the Commission nationale de l’informatique et des libertés (CNIL) on 16/04/2020. Signed consent by patients, medical and administrative staff, and visitors was not required according to the CPP and CNIL, but participants could refuse to participate. When patients were minors, unable to refuse or under guardianship, parents, family or guardians, respectively, were asked. The study was carried out in accordance with the Declaration of Helsinki.