Experiment 1: Pack-living dogs and wolves
Study site and animals
We tested 10 grey wolves and 11 mongrel dogs (Table S1) housed at the Wolf Science Center (WSC, www.wolfscience.at/en/), Ernstbrunn, Austria, in a within-subject design. All animals were hand-raised by animal professionals and separated from their mothers in their first week of life, experiencing continuous (24 h) contact with humans in the first 4–5 months of their life. After this period, regular positive reinforcement training sessions and behavioral or cognitive testing ensured that all animals experienced comparable levels of positive contact with humans throughout their lives. All dogs and wolves were housed in dyads or groups of up to four animals in large outdoor enclosures. In accordance with their species-specific requirements, dogs were fed daily with commercial dog kibble and wolves were provisioned with raw meat and carcasses of deer, rabbit, and chicken, 3–4 times a week. Data collection took place from May to early December 2017 and late April to November 2018, avoiding the wolf breeding season and its associated behavioral and hormonal changes. All male dogs and wolves were vasectomized at approximately 6 months of age to prevent reproduction while maintaining their full endocrine and behavioral profiles. Female dogs were not sampled if they were in heat and male dogs were not sampled if their pack included a female in heat. In such cases, we waited until all behavioral and external physiological signs of the heat had disappeared before resuming testing.
Human participants
Twelve adult, female participants who either worked at the WSC as animal trainers (i.e., who were hand raisers of the animals, thus acting as the bonded interaction partners; N = 5) or as researchers (i.e. familiar people who did not have direct contact with the animals and were never involved in hand raising, thus acting as the familiar interaction partners for the animals; N = 7) volunteered to participate in the study. The perception of a stronger bond with the animal following the hand-raising experience is confirmed by the fact that animal trainers working at the WSC consistently rated their relationships with a particular animal higher if they were directly involved in its hand raising than if they were not.
Experimental design
To compare behavioral and hormonal responses of pack-living dogs and wolves following close social contact with different human partners, all animals were tested in four conditions: (1) a ‘baseline’ (i.e., the focal animals were resting in their home enclosures), (2) a dyadic social interaction test with a familiar human partner, (3) a dyadic social interaction test with a bonded human partner, and (4) a food control (Table 2). Comparing the social interaction conditions with the food control allowed us to evaluate whether physical interaction per se was responsible for specific hormonal changes. Comparing the two social conditions allowed us to assess whether oxytocin and glucocorticoid release were mediated by physical contact time and/or relationship strength with the human partner.
Testing was conducted in a semi-randomized and counterbalanced order. The interaction tests involved a ‘cuddle session’ which lasted 5 min and during which animals stayed in their familiar home enclosures and were free to approach the fence to be petted by the human partner (Fig. 1 A-B; Movies S1 and S2). This was followed by a short training session, during which animals were asked to respond to known commands in exchange for a food reward (15 pieces of sausage, 1 × 1 × 1 cm). The same amount of food was given to the animals in the food control by a third person (an animal keeper who was not involved in hand raising and did not participate in the dyadic social interaction tests), but without any physical interaction or verbal praise. Prior to testing, focal animals were not fed and did not participate in other tests or activities for at least 2 h. Each test was preceded by 60 min of behavioral observations of the focal animals to ensure that no major disruptions occurred which could have affected subsequent hormonal measures. If any disruption occurred (e.g., aggression within the pack, playing and grooming bouts, or external stimuli such as passing visitors, trainers, students, or researchers that caused an observable behavioral change in the focal pack), the testing was rescheduled for another day. Each animal was tested only once per day and provided two baseline samples on separate occasions.
Human participants were tested with at least one wolf or one dog and in the respective control conditions (animal present, no interaction; animal not present) (Table 2). The ‘animal present, no interaction’ control was analogous to the interaction condition, but no interaction (no touching, talking, or gazing) between the human and the animal took place. The human participant walked to the animal’s home enclosure together with the experimenter, where she sat down quietly in the air lock compartment, facing away from the animal to avoid eye contact. She stayed there for the same duration as the interaction lasted in the social interaction condition. Following a 60 min waiting period during which she was not allowed to interact with other humans or animals (computer work, reading, or resting was allowed), she provided the post-test urine sample. For the ‘animal not present’ control, the human participant provided a pre-test urine sample, and then proceeded to work on a computer, read, or rest for 60 min, without any contact with animals or humans. She then provided the post-test urine sample.
Behavioral data collection
Social interaction tests were filmed, and videos were subsequently coded with Solomon behavioral coding software (version beta 17.03.22, copyright András Péter, https://solomon.andraspeter.com) using a canid ethogram (Table S2). All subjects were free to move around in the enclosures during testing while the camera was positioned on a fixed spot, so both the total duration of the interaction session and the time the animals were visible on video varied slightly. To control for this, we first calculated the ‘total time in sight’ by subtracting the time spent out of sight from the total duration of the session. We then normalized two measures by the ‘total time in sight’ for further statistical analyses: the time spent in body contact (duration in body contact/total time in sight) and the rate of self-directed behaviors (SDB) per second (sum of yawning, lip licking, and head or body shakes/total time in sight). We subsequently refer to the ‘total time in sight’ as the ‘interaction time’. Inter-observer reliability (IOR) testing was conducted on 20% of the interaction videos scored by two independent observers using the package irr (version 0.84.1) in R, version 4.0.2. This revealed very high reliability for all behaviors: duration spent in proximity to human partner, intraclass correlation coefficient (ICC) = 0.99, P < 0.01, 95% confidence interval (CI) 0.92–0.99; duration spent in body contact with human partner, ICC = 0.99, P < 0.01, CI 0.79–0.99; sum of self-directed behaviors, ICC = 0.93, P < 0.01, CI 0.47–0.99.
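The normalization described above can be illustrated with a minimal Python sketch. The class name, field names, and example values are hypothetical and chosen only for illustration; they are not taken from the coding software's export format or from the study data.

```python
from dataclasses import dataclass


@dataclass
class CodedSession:
    """Hypothetical summary of one coded interaction video (durations in seconds)."""
    session_s: float        # total duration of the interaction session
    out_of_sight_s: float   # time the animal was not visible on video
    body_contact_s: float   # time spent in body contact with the human partner
    yawns: int
    lip_licks: int
    shakes: int             # head or body shakes


def normalize(session: CodedSession) -> dict:
    """Normalize body contact and SDB counts by the 'total time in sight'."""
    in_sight = session.session_s - session.out_of_sight_s
    if in_sight <= 0:
        raise ValueError("animal was never in sight; measures are undefined")
    sdb_count = session.yawns + session.lip_licks + session.shakes
    return {
        "interaction_time_s": in_sight,
        "contact_proportion": session.body_contact_s / in_sight,
        "sdb_rate_per_s": sdb_count / in_sight,
    }


# Illustrative values only: a 5 min session with 60 s out of sight.
example = CodedSession(300.0, 60.0, 120.0, yawns=2, lip_licks=3, shakes=1)
result = normalize(example)
```

With these made-up values, the interaction time is 240 s, so the animal spent half of its visible time in body contact and showed six self-directed behaviors over 240 s.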
Urine sample collection and hormone measurement
Spontaneously voided urine samples were collected non-invasively from all dogs and wolves during leashed walks using an expandable metal stick with a plastic cup attached to it. Prior to testing, all animals were habituated to the urine sampling procedure. All animals were taken on urine collection walks 60 min after testing and we collected the first urine they voided. Samples used for analysis were collected on average 74 min after testing (SD 8.5 min, range 60–104 min; comparable to previous studies29,44,50). The baseline samples were collected following at least 60 min of undisturbed resting in the familiar home enclosures. Human urine samples were taken by the participants themselves immediately before and 60 min after the interaction test.
Upon collection, all samples were split into four 1 ml aliquots. To prevent oxytocin degradation in the samples and optimize conditions for storage49,50,51,52, 100 µl of 0.1% phosphoric acid (PA) was added to two aliquots. One 1 ml aliquot of each sample was kept without PA to measure urinary specific gravity (SG; to account for urine dilution in wolf and dog samples), urinary creatinine (crea; to control for urine dilution in human samples) and glucocorticoids. All samples were frozen at −20 °C within 15 min of collection. Solid-phase extractions (SPE) following a previously validated protocol were performed for urinary OTM measurement, and diethyl-ether extractions for urinary GCM. Extracted samples were sealed and shipped on dry ice to the Max-Planck-Institute for Evolutionary Anthropology, Leipzig, Germany, for OTM measurement, and to the Unit of Physiology, Pathophysiology and Experimental Endocrinology, University of Veterinary Medicine, Vienna, Austria, for GCM measurement. For urinary OTM measurement, we used a commercially available enzyme-immunoassay (EIA) kit (Arbor Assays, Ann Arbor, Cat.No: K048-H5). The assay was analytically and physiologically validated for our study species and sample matrix. The assay standard curve ranged from 16.38 to 10 000 pg/ml and assay sensitivity was 17.0 pg/ml. Intra-assay coefficients of variation (CVs) were 8.6% (dog/wolf samples) and 9.5% (human samples). Inter-assay CVs of high and low concentration quality controls (QCs) were 9.7% (high) and 12.4% (low). For urinary GCM measurement, we used an in-house cortisol EIA with an antibody produced against cortisol-21-HS:BSA, previously validated for our purpose. The assay standard curve ranged from 2 to 200 pg/well and assay sensitivity was 2 pg/well. Intra- and inter-assay CVs were 5.3% and 7.5%, respectively. All samples were measured in duplicate and re-measured if the optical density (OD) values of the duplicates differed by more than 10%.
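As a rough illustration of the duplicate check and of a coefficient of variation, consider the following Python sketch. It rests on two assumptions that the text does not specify: that the difference between duplicates is expressed relative to their mean, and that a CV is the sample standard deviation divided by the mean. All function names and values are hypothetical.

```python
from statistics import mean, stdev


def duplicate_difference_pct(od_a: float, od_b: float) -> float:
    """Absolute difference between duplicates as a percentage of their mean (assumed definition)."""
    return abs(od_a - od_b) / mean([od_a, od_b]) * 100.0


def needs_repeat(od_a: float, od_b: float, threshold_pct: float = 10.0) -> bool:
    """Flag a sample for re-measurement if its duplicates differ by more than the threshold."""
    return duplicate_difference_pct(od_a, od_b) > threshold_pct


def cv_pct(values: list) -> float:
    """Coefficient of variation in percent: sample SD divided by the mean (assumed definition)."""
    return stdev(values) / mean(values) * 100.0


# Illustrative optical densities only.
close_pair = needs_repeat(0.50, 0.52)   # duplicates differ by about 3.9%
far_pair = needs_repeat(0.40, 0.50)     # duplicates differ by about 22%
```

Under these assumptions, the first pair would be accepted and the second would be measured again.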
Wolf and dog urinary OTM and GCM concentrations were corrected for variable water content of the samples using urinary SG, measured with a digital refractometer, as previously described50,51,52. SG corrected hormone concentrations are expressed as OTM pg/ml SG and GCM ng/ml SG. Human urinary OTM concentrations were expressed as pg/mg crea to account for variable concentration and volume of the voided samples.
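The two dilution corrections can be sketched as follows. The SG formula shown here, which scales each concentration by the ratio of a reference SG to the sample SG (each minus 1), is a commonly used form and is an assumption; the protocol actually applied is the one in the cited references. Function names, the reference SG, and all values are hypothetical.

```python
def sg_corrected(conc: float, sg_sample: float, sg_reference: float) -> float:
    """Correct a hormone concentration for urine dilution using specific gravity.

    Assumed formula: conc * (SG_reference - 1) / (SG_sample - 1), where
    SG_reference is, for example, the mean SG of the study population.
    """
    if sg_sample <= 1.0:
        raise ValueError("sample SG must exceed 1.0 (the SG of water)")
    return conc * (sg_reference - 1.0) / (sg_sample - 1.0)


def creatinine_corrected(conc_pg_per_ml: float, crea_mg_per_ml: float) -> float:
    """Express a hormone concentration per mg creatinine (pg/ml divided by mg/ml gives pg/mg)."""
    if crea_mg_per_ml <= 0:
        raise ValueError("creatinine concentration must be positive")
    return conc_pg_per_ml / crea_mg_per_ml


# Illustrative values only: a concentrated sample (SG 1.040) is scaled down
# towards a reference SG of 1.020.
otm_sg = sg_corrected(100.0, sg_sample=1.040, sg_reference=1.020)
otm_crea = creatinine_corrected(200.0, crea_mg_per_ml=2.0)
```

In this example, the concentrated sample's value is halved by the SG correction, and the human sample is expressed as 100 pg/mg crea.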
Experiment 2: Pet dogs
Experimental design
The surprisingly few differences between pack-living dogs and wolves in their behavioral and hormonal responses to social interactions with humans contrast strikingly with recent findings, suggesting that the latter may have been biased by the use of pet dogs instead of animals with similar life experiences. Hence, we conducted a second experiment with pet dogs using the same test paradigm. Ten female, adult dog owners and their pet dogs (5 males and 5 females; mean (SD) age 6.9 (4.1) years; Table S8) were recruited from among people known to the researchers and volunteered to participate in the current study. Pet dog-owner dyads were chosen on the basis that they were highly familiar with the testing environment at the WSC, to keep conditions as similar as possible to those of the pack-living animals. Female colleagues who were familiar with the dogs (i.e., had interacted with them before on multiple occasions) were recruited to act as familiar interaction partners for the pet dogs. All dogs had lived with their owners for at least 6 months prior to the start of the study and continued to do so after the study ended.
We tested the pet dogs outdoors in two settings: (1) in an enclosure-type setting with a fence between the human partner and the dog, to keep procedures consistent with Experiment 1, and (2) in an open space without a fence. We included the second setting because we hypothesized that the presence of a fence may affect the dogs’ interest in approaching and staying close to the humans, particularly in pet dogs, who are not used to this mode of interaction. The test locations were familiar to the pet dogs and they were given several minutes to explore their surroundings before testing started. Pet dogs were tested in both settings (i.e., with/without the fence), with their owners, a familiar person, and in the food condition, except for one dog that completed only four tests due to owner time constraints and unavailability of the dog for further tests. The baseline measure was taken following a 60 min resting period in their familiar environment (i.e., the owner’s office). The social interaction tests and food control were performed analogously to Experiment 1. To eliminate possible order effects, testing was conducted in a semi-randomized and counterbalanced order across subjects, and each pet dog was tested only once per day. The dogs were not fed, walked, or otherwise interacted with during the 2 h prior to the start of the test. Each test was preceded by a 60 min resting period in their familiar environment with the owner present but instructed not to interact with the dog.
Behavioral and hormonal data collection and sample treatment were identical to Experiment 1, as described above.
Statistical analyses
We fitted generalized linear mixed models (GLMM) and linear mixed models (LMM) in R (version 4.0.2, https://www.R-project.org), using the function ‘lmer’ of the package lme4 (version 1.1–21), with the optimizer ‘bobyqa’, and the function ‘glmmTMB’ of the package glmmTMB (version 1.0.2.1, 56). We included all identifiable random slopes57,58, which were manually dummy-coded and centered. To keep the type I error rate at 5%, we compared each full model with its respective null model, which lacked only the test predictor but comprised the control predictors and the complete random effects structure, using a likelihood ratio test. If a higher-order interaction term was not significant, we fitted reduced models lacking that interaction term but comprising all relevant lower-order interactions or main effect terms. Model stability was assessed by comparing the estimates obtained from the model based on all data with those obtained from models with the levels of the random effects excluded one at a time. This revealed good model stability except for the model on SDB rates in pack-living dogs and wolves, which indicates high uncertainty in the results obtained from this model (Table S5) and warrants cautious interpretation. We performed parametric bootstrapping to obtain confidence intervals (function ‘bootMer’ of lme4) and assessed collinearity with the function ‘vif’ of the package car (version 3.0-0, 61), applied to a model lacking the random effects. This revealed no issues with collinearity in any of the models. For a detailed description of test and control predictors, as well as random effects included in fitted models, and all model output tables, see supplementary material.
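The arithmetic behind a full-null model comparison can be sketched in Python as follows. This is not the authors' R code: it assumes the log-likelihoods of both fitted models are already available, and the log-likelihood values and degrees of freedom below are invented for illustration. The chi-square tail probability is computed with closed-form series for integer degrees of freedom so that the example needs only the standard library.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: two-sided normal tail plus a finite series in sqrt(x).
    chi = math.sqrt(x)
    normal_tail = math.erfc(chi / math.sqrt(2.0))
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return normal_tail + 2.0 * density * total


def likelihood_ratio_test(loglik_full: float, loglik_null: float, df_diff: int):
    """Return the test statistic 2 * (logLik_full - logLik_null) and its p-value."""
    stat = 2.0 * (loglik_full - loglik_null)
    return stat, chi2_sf(stat, df_diff)


# Hypothetical log-likelihoods; the full model has two more parameters than the null.
stat, p_value = likelihood_ratio_test(-120.0, -123.0, df_diff=2)
```

With these invented values, the test statistic is 6.0 and the p-value falls just below 0.05, so the full model would be preferred over the null model at the 5% level.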
A total of 104 samples (50 wolf and 54 pack dog samples) were used for statistical analyses of urinary OTM concentrations across all test conditions. A total of 42 samples (20 wolf and 22 pack dog samples) were used for statistical analysis of OTM concentrations and 40 samples (19 wolf and 20 pack dog samples) were used for analysis of GCM concentrations following the social interaction tests. A total of 204 human urine samples, comprising 102 pairs of matched pre- and post-test samples, were used for statistical analyses of human urinary OTM concentrations. A total of 67 pet dog urine samples were used for statistical analysis of urinary OTM concentrations and 61 for GCM concentrations.
Ethics statement
This study was discussed and approved by the institutional Ethics and Animal Welfare Committee at the University of Veterinary Medicine Vienna, in accordance with Good Scientific Practice and ARRIVE guidelines and national as well as EU legislation (Protocol number ETK-05/03/2017). The human part was discussed and approved by the Ethics Committee at the Medical University of Vienna (Protocol number 1769/2017) and performed in accordance with the relevant guidelines and regulations. Written informed consent was obtained from each participant after full explanation of the nature of all procedures used. Written informed consent was also obtained from each participant to publish images and findings in an open-access publication.