Kankaanpään Maila (KaMa), a pesäpallo (Finnish baseball) club founded in 1958, has since its founding been a key actor in the sporting life of Kankaanpää and one of the things for which Kankaanpää is known. This study examines the developments that have sustained the club's successful position in the region's sporting and social life, and as a nationwide pesäpallo centre, over several decades, and it investigates the club's role in building local community spirit and identity. The study examines the challenges of club activity in Kankaanpää from the late 1950s onward, the mechanisms by which KaMa has overcome them, and how the club has changed as an organisation. The result is a picture of the regional characteristics and the actors' choices whose combined effect has linked pesäpallo and Kankaanpää together. To sharpen the overall picture, the study covers the sporting life of the whole region from the early 1900s and follows, up to the 2000s, the field of activity in which KaMa and the region's other clubs and associations have operated. A further essential question is whether pesäpallo has had alternatives and competition in Kankaanpää. To place an individual club and municipality within a larger sociocultural whole, the study first considers the development that led from early games and play to the growing competitiveness of sport and to its status as a resource for national defence. A brief presentation of the educational philosophy of Lauri "Tahko" Pihkala, the father of pesäpallo and a patriotic thinker, is also relevant to understanding this development. The historical background, the Kankaanpää case study, and data on the geographical distribution of national success in ball sports together provide a comparison that can point to reasons why certain sports have become traditional success sports in certain regions. 
The development of municipal and national sporting life is anchored in general societal transformations, such as a shift in body culture towards favouring sport and exercise, the rising popularity of fitness exercise, and the commercialisation, productisation, and professionalisation of sport. The differentiation of physical culture and the growing number of sports have required actors to adapt and to seek new forms of cooperation and activity. Studying the history and survival strategies of a traditional, regionally rooted club through the KaMa case and regional research brings out one example of transformation, survival, and permanence amid a changing society and operating environment. Out of this change, the essential nature and meaning of the KaMa identity take shape and crystallise. The written source material consisted of literature and research on the history of pesäpallo and sport, among which the studies of changes in physical culture by Hannu Itkonen, alone and with co-authors, have been invaluable. In particular, the interdependence of media, audience, finances, and sporting success is strongly linked to the conclusions of my own study. Information on the history of KaMa and of sporting life in Kankaanpää was provided by the archives of KaMa and of the town of Kankaanpää. The study also made extensive use of individuals' collections of newspaper clippings. For investigating the KaMa identity, perhaps the most fruitful source material was the interviews. Dozens of former players, managers, and other club workers were interviewed face to face, by e-mail, and by letter. The interview material is oral history in nature, and its value lies as much in how people reminisce and what they remember as in the absolute accuracy of the information. A central task was to examine and interpret whether a shared KaMa identity existed and exists and, if so, to outline and analyse its characteristic features. 
Another important object of investigation was the significance of pesäpallo for the identity of Kankaanpää as a whole. I also used statistical methods to form a picture of Kankaanpää as a sports centre. By comparing the all-time league tables of different ball sports, it was possible to find out in which regions a tradition in each sport has emerged, or has failed to emerge. By comparing the populations and economic structures of these regions, I have drawn conclusions about the conditions under which sports have taken root and succeeded. I also examined teams' average attendances, which are one measure of a team's appeal in its region. For KaMa, I compared average attendance with sporting success and found that the team's success is the most important factor bringing people to the stands. A rural location is not a prerequisite for a successful pesäpallo centre, but an economic area can, given its size, sustain only a limited number of successful sporting cultures. In Helsinki and Tampere there are enough supporters and participants for ice hockey and football, but no longer for pesäpallo. In Kankaanpää there are sponsorship money and participants for only one sport, pesäpallo, which is less commercial than the major sports. Pesäpallo, which Tahko Pihkala developed as a fitness school for the people, was adopted into the programme of the Civil Guard (suojeluskunta), within which it spread to the countryside and took root there. This was a key factor in the survival of pesäpallo and in its attaining the status of national game. Rural teams held their own against town teams, which increased enthusiasm and self-confidence. In Kankaanpää, too, people wanted to play pesäpallo, and at a high level, which led two clubs on opposite political sides, and their members, to join forces. An important motive was to represent the region successfully beyond its borders, and soon a further goal was to bring as many young people and children as possible into the sport. There was lively sporting activity in Kankaanpää even before KaMa, but there was demand for a team sport. 
KaMa's history can be summarised as a decades-long struggle for both sporting success and resources. Finances have never been easy for a sports club run by volunteers, and in a limited operating and economic area the importance of active volunteer club workers and sympathetic financial backers has been irreplaceable. Pesäpallo faced competition, but KaMa as an organisation was able to meet the resource challenges, to attract enough participants and spectators, and to develop and recruit enough good first-team players for the club to achieve success, which in turn fed interest and strengthened the tradition. Throughout, however, the decisive factor was the work of a small but hard-working group of volunteers. Pesäpallo and Kankaanpään Maila grew into part of Kankaanpää's identity. The pesäpallo team was what the region was known for elsewhere. The greatest success came in the 1970s, but the tradition has endured despite a match-fixing scandal and relegations. The general commercialisation and fragmentation of physical culture, the professionalisation of players, and the growth in citizens' leisure options have increased the challenges of club work. Through improvisation and hard work, and relying on loyal and dutiful club workers and backers, KaMa has nevertheless coped with and survived both the match-fixing scandal and relegations to lower divisions. At the core of the KaMa identity are a shared history and tradition. The first prerequisite for the continuity of the KaMa community and organisation has been, and remains, that actors who are aware of the tradition and want continuity strive to maintain the communal network formed by spectators, participants, professional players, and backers, take responsibility for the organisation, and strive to make it a communal, interesting, and financially and educationally successful "product". Keywords: pesäpallo, Kankaanpää, sport
At first glance, the figures in the vipunen.fi service of the Ministry of Education (Opetusministeriö) give a completely wrong impression of the employment of statistics graduates. Combining the report's years 2009–2016 shows, however, that about 5% of statistics graduates have been unemployed one year after graduation. The cause of the misleading figures turns out to be the data protection applied to the report.
Do-calculus is concerned with estimating the interventional distribution of an action from the observed joint probability distribution of the variables in a given causal structure. All identifiable causal effects can be derived using the rules of do-calculus, but the rules themselves do not give any direct indication whether the effect in question is identifiable or not. Shpitser and Pearl (2006b) constructed an algorithm for identifying joint interventional distributions in causal models, which contain unobserved variables and induce directed acyclic graphs. This algorithm can be seen as a repeated application of the rules of do-calculus and known properties of probabilities, and it ultimately either derives an expression for the causal distribution, or fails to identify the effect, in which case the effect is non-identifiable. In this paper, the R package causaleffect is presented, which provides an implementation of this algorithm. Functionality of causaleffect is also demonstrated through examples.
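As a small illustration of what "deriving an expression for the causal distribution" from the observed joint distribution means, the sketch below checks the textbook front-door formula, P(y | do(x)) = Σ_z P(z | x) Σ_x' P(y | x', z) P(x'), on a toy binary model with an unobserved confounder. This is a Python sketch written for this note, not the causaleffect package (which is in R) and not its algorithm; the graph, the probability tables, and all function names are illustrative assumptions.

```python
from itertools import product

# Hypothetical model (all numbers are illustrative assumptions):
# U -> X, U -> Y, X -> Z, Z -> Y, where U is unobserved.
P_U1 = 0.5
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}
P_Z1_GIVEN_X = {0: 0.3, 1: 0.9}


def p_y1_given_zu(z, u):
    """P(Y = 1 | Z = z, U = u) in the assumed model."""
    return 0.1 + 0.4 * z + 0.4 * u


def bern(p_one, value):
    """Probability that a Bernoulli(p_one) variable equals value."""
    return p_one if value == 1 else 1.0 - p_one


# Observed joint distribution P(x, z, y): the confounder U is summed out.
JOINT = {
    (x, z, y): sum(
        bern(P_U1, u)
        * bern(P_X1_GIVEN_U[u], x)
        * bern(P_Z1_GIVEN_X[x], z)
        * bern(p_y1_given_zu(z, u), y)
        for u in (0, 1)
    )
    for x, z, y in product((0, 1), repeat=3)
}
NAMES = ("x", "z", "y")


def prob(**fixed):
    """Marginal probability of the event given by keyword arguments x, z, y."""
    return sum(
        p for key, p in JOINT.items()
        if all(key[NAMES.index(name)] == value for name, value in fixed.items())
    )


def front_door(y, x):
    """P(y | do(x)) computed from the observed joint via the front-door formula."""
    total = 0.0
    for z in (0, 1):
        p_z_given_x = prob(x=x, z=z) / prob(x=x)
        inner = sum(
            prob(x=xp, z=z, y=y) / prob(x=xp, z=z) * prob(x=xp) for xp in (0, 1)
        )
        total += p_z_given_x * inner
    return total


def truth(y, x):
    """P(y | do(x)) computed from the full model, which uses the hidden U."""
    return sum(
        bern(P_Z1_GIVEN_X[x], z) * bern(P_U1, u) * bern(p_y1_given_zu(z, u), y)
        for z in (0, 1) for u in (0, 1)
    )


def naive(y, x):
    """Ordinary conditional probability P(y | x), confounded by U."""
    return prob(x=x, y=y) / prob(x=x)


print(front_door(1, 1), truth(1, 1), naive(1, 1))
```

In this toy model the front-door expression agrees with the interventional distribution computed from the full model, while the plain conditional P(y | x) does not, which is the gap that identification closes.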
Causal models communicate our assumptions about causes and effects in real-world phenomena. Often the interest lies in the identification of the effect of an action, which means deriving an expression from the observed probability distribution for the interventional distribution resulting from the action. In many cases an identifiability algorithm may return a complicated expression that contains variables that are in fact unnecessary. In practice this can lead to additional computational burden and increased bias or inefficiency of estimates when dealing with measurement error or missing data. We present graphical criteria to detect variables which are redundant in identifying causal effects. We also provide an improved version of a well-known identifiability algorithm that implements these criteria.
Obtaining a non-parametric expression for an interventional distribution is one of the most fundamental tasks in causal inference. Such an expression can be obtained for an identifiable causal effect by an algorithm or by manual application of do-calculus. Often we are left with a complicated expression which can lead to biased or inefficient estimates when missing data or measurement errors are involved. We present an automatic simplification algorithm that seeks to eliminate symbolically unnecessary variables from these expressions by taking advantage of the structure of the underlying graphical model. Our method is applicable to all causal effect formulas and is readily available in the R package causaleffect.
The mobile phone data were collected in February 2013 together with the National Consumer Net Shopping Study conducted by the market research company Tietoykkönen Oy. The target group was mobile phone owners aged 15–79 years in Finland. The data were collected by telephone interviews using a computer-assisted telephone interviewing (CATI) system. The sample source was the targeting service Fonecta Finder B2C, which contains all publicly available phone numbers in Finland. Random sampling used quotas on respondents' gender, age, and region at the major-region level, excluding the autonomous region of Åland. The sample size was 536 completed interviews. All 536 survey respondents had a mobile phone. The respondents answered the following questions (originally in Finnish): What is the brand of your mobile phone? When did you purchase your mobile phone? (year and month; if the month was not recalled, the season was asked) What was the brand of your previous mobile phone? When did you purchase your previous mobile phone? (year and month; if the month was not recalled, the season was asked) Which brand would be the most interesting for you if you were to buy a mobile phone now? Is your mobile phone a smartphone, a feature phone with an internet connection, or a phone without an internet connection? In addition, the respondents were asked for their gender, age group (six categories: 15–24, 25–34, 35–44, 45–55, 55–64 and 65–79 years), geographical region (Helsinki-Uusimaa, Southern Finland, Western Finland, Northern & Eastern Finland) and household income (five categories: 30,000 euros or less, 30,001–50,000 euros, 50,001–70,000 euros, over 70,000 euros, and no answer). Citation: Cannot be used without citation to J. Karvanen, A. Rantanen, L. Luoma, Survey data and Bayesian analysis: a cost-efficient way to estimate customer equity. Quantitative Marketing and Economics, DOI:10.1007/s11129-014-9148-4, 2014. 
(preprint available at http://arxiv.org/pdf/1304.5380)
Identification of causal effects is one of the most fundamental tasks of causal inference. We consider an identifiability problem where some experimental and observational data are available but neither data source alone is sufficient for the identification of the causal effect of interest. Instead of the outcome of interest, surrogate outcomes are measured in the experiments. This problem is a generalization of identifiability using surrogate experiments [1] and we label it as surrogate outcome identifiability. We show that the concept of transportability [2] provides a sufficient criterion for determining surrogate outcome identifiability for a large class of queries.
We propose an approach for the planning of longitudinal covariate measurements in follow-up studies where covariates are time-varying. We assume that the entire cohort cannot be selected for longitudinal measurements due to financial limitations, and study how a subset of the cohort should be selected optimally, in order to obtain precise estimates of covariate effects in a survival model. In our approach, the study will be designed sequentially utilizing the data collected in previous measurements of the individuals as prior information. We propose using a Bayesian optimality criterion in the subcohort selections, which is compared with simple random sampling using simulated and real follow-up data. Our work improves the computational approach compared to the previous research on the topic so that designs with several covariates and measurement points can be implemented. As an example we derive the optimal design for studying the effect of body mass index and smoking on all-cause mortality in a Finnish longitudinal study. Our results support the conclusion that the precision of the estimates can be clearly improved by optimal design.
Science can be seen as a sequential process where each new study augments evidence to the existing knowledge. To have the best prospects to make an impact in this process, a new study should be designed optimally taking into account the previous studies and other prior information. We propose a formal approach for the covariate prioritization, i.e., the decision about the covariates to be measured in a new study. The decision criteria can be based on conditional power, change of the p-value, change in lower confidence limit, Kullback-Leibler divergence, Bayes factors, Bayesian false discovery rate or difference between prior and posterior expectation. The criteria can be also used for decisions on the sample size. As an illustration, we consider covariate prioritization based on genome-wide association studies for C-reactive protein levels and make suggestions on the genes to be studied further.
An iterative Bayesian optimisation technique is presented to find spatial designs of data that carry much information. We use the decision theoretic notion of value of information as the design criterion. Gaussian process surrogate models enable fast calculations of expected improvement for a large number of designs, while the full-scale value of information evaluations are only done for the most promising designs. The Hausdorff distance is used to model the similarity between designs in the surrogate Gaussian process covariance representation, and this allows the suggested algorithm to learn across different designs. We study properties of the Bayesian optimisation design algorithm in a synthetic example and real-world examples from forest conservation and petroleum drilling operations. In the synthetic example we consider a model where the exact solution is available and we run the algorithm under different versions of this example and compare it with existing approaches such as sequential selection and an exchange algorithm.
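The Hausdorff distance mentioned above has a short direct definition for finite point sets: the largest distance from any point in one set to its nearest neighbour in the other, taken in both directions. A minimal Python sketch follows, assuming that a spatial design is represented as a list of coordinate tuples and that distances are Euclidean; the representation and function names are assumptions made here, and how the distance enters the Gaussian process covariance is not specified in the abstract, so it is left out.

```python
from math import dist


def directed_hausdorff(a, b):
    """Largest distance from a point of design a to its nearest point in design b."""
    return max(min(dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite designs."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Two hypothetical designs with two measurement locations each.
design_a = [(0.0, 0.0), (1.0, 0.0)]
design_b = [(0.0, 0.0), (3.0, 0.0)]
print(hausdorff(design_a, design_b))
```

Designs that share most locations get a small distance, which is what lets a surrogate model borrow information between similar designs.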
In epidemiological surveys, data missing not at random (MNAR) due to survey nonresponse may potentially lead to a bias in the risk factor estimates. We propose an approach based on Bayesian data augmentation and survival modelling to reduce the nonresponse bias. The approach requires additional information based on follow-up data. We present a case study of smoking prevalence using FINRISK data collected between 1972 and 2007 with a follow-up to the end of 2012 and compare it to other commonly applied missing at random (MAR) imputation approaches. A simulation experiment is carried out to study the validity of the approaches. Our approach appears to reduce the nonresponse bias substantially, whereas MAR imputation was not successful in bias reduction.
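Why MNAR nonresponse biases a prevalence estimate can be seen with simple arithmetic. In the sketch below, smokers are assumed to respond less often than non-smokers, and the prevalence among respondents is compared with the true prevalence. The response probabilities and the prevalence are hypothetical values chosen for illustration, not FINRISK results, and the function name is an assumption.

```python
def complete_case_prevalence(true_prevalence, response_if_smoker, response_if_nonsmoker):
    """Smoking prevalence observed among respondents only.

    Assumes the response probability depends on smoking status itself,
    which is the situation described as missing not at random (MNAR).
    """
    responding_smokers = true_prevalence * response_if_smoker
    responding_nonsmokers = (1.0 - true_prevalence) * response_if_nonsmoker
    return responding_smokers / (responding_smokers + responding_nonsmokers)


# Hypothetical inputs: 30% true prevalence; smokers respond with
# probability 0.5 and non-smokers with probability 0.8.
observed = complete_case_prevalence(0.30, 0.5, 0.8)
print(observed)  # below the true 0.30, because smokers are under-represented
```

When both groups respond at the same rate, the function returns the true prevalence, so the bias comes entirely from the dependence of response on the unobserved value.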
Graphs are commonly used to represent and visualize causal relations. For a small number of variables, this approach provides a succinct and clear view of the scenario at hand. As the number of variables under study increases, the graphical approach may become impractical, and the clarity of the representation is lost. Clustering of variables is a natural way to reduce the size of the causal diagram, but it may erroneously change the essential properties of the causal relations if implemented arbitrarily. We define a specific type of cluster, called transit cluster, that is guaranteed to preserve the identifiability properties of causal effects under certain conditions. We provide a sound and complete algorithm for finding all transit clusters in a given graph and demonstrate how clustering can simplify the identification of causal effects. We also study the inverse problem, where one starts with a clustered graph and looks for extended graphs where the identifiability properties of causal effects remain unchanged. We show that this kind of structural robustness is closely related to transit clusters.
Objective: In epidemiological follow-up studies, many key covariates, such as smoking, use of medication, blood pressure and cholesterol, are time-varying. Because of practical and financial limitations, time-varying covariates cannot be measured continuously, but only at certain prespecified time points. We study how the number of these longitudinal measurements can be chosen cost-efficiently by evaluating the usefulness of the measurements for risk prediction. Study Design and Setting: The usefulness is addressed by measuring the improvement in model discrimination between models using different amounts of longitudinal information. We use simulated follow-up data and the data from the Finnish East–West study, a follow-up study, with eight longitudinal covariate measurements carried out between 1959 and 1999. Results: In a simulation study, we show how the variability and the hazard ratio of a time-varying covariate are connected to the importance of re-measurements. In the East–West study, it is seen that for older people, the risk predictions obtained using only every other measurement are almost equivalent to the predictions obtained using all eight measurements. Conclusion: Decisions about the study design have significant effects on the costs. The cost-efficiency can be improved by applying the measures of model discrimination to data from previous studies and simulations.
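Model discrimination is often summarised by a concordance measure. The abstract does not say which measure the study used, so the Python sketch below takes the area under the ROC curve (AUC) for a binary outcome as one common choice: the proportion of case and control pairs in which the case has the higher predicted risk, with ties counted as one half. The risk scores are hypothetical.

```python
def auc(case_scores, control_scores):
    """Proportion of (case, control) pairs ranked correctly by predicted risk."""
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5  # ties count as half a concordant pair
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks from two models for the same individuals.
fewer_measurements = auc([0.9, 0.4], [0.3, 0.5])
all_measurements = auc([0.9, 0.6], [0.3, 0.5])
print(all_measurements - fewer_measurements)  # gain in discrimination
```

Comparing such a measure between models that use different numbers of longitudinal measurements gives one way to judge whether the extra measurements are worth their cost.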
Causal effect identification considers whether an interventional probability distribution can be uniquely determined from a passively observed distribution in a given causal structure. If the generating system induces context-specific independence (CSI) relations, the existing identification procedures and criteria based on do-calculus are inherently incomplete. We show that deciding causal effect non-identifiability is NP-hard in the presence of CSIs. Motivated by this, we design a calculus and an automated search procedure for identifying causal effects in the presence of CSIs. The approach is provably sound and it includes standard do-calculus as a special case. With the approach we can obtain identifying formulas that were unobtainable previously, and demonstrate that a small number of CSI relations may be sufficient to turn a previously non-identifiable instance into an identifiable one.
Causal effect identification considers whether an interventional probability distribution can be uniquely determined without parametric assumptions from measured source distributions and structural knowledge on the generating system. While complete graphical criteria and procedures exist for many identification problems, there are still challenging but important extensions that have not been considered in the literature such as combined transportability and selection bias, or multiple sources of selection bias. To tackle these new settings, we present a search algorithm directly over the rules of do-calculus. Due to the generality of do-calculus, the search is capable of taking more advanced data-generating mechanisms into account along with an arbitrary type of both observational and experimental source distributions. The search is enhanced via a heuristic and search space reduction techniques. The approach, called do-search, is provably sound, and it is complete with respect to identifiability problems that have been shown to be completely characterized by do-calculus. When extended with additional rules, the search is capable of handling missing data problems as well. With the versatile search, we are able to approach new problems for which no other algorithmic solutions exist. We perform a systematic analysis of bivariate missing data problems and study causal inference under case-control design. We also present the R package dosearch that provides an interface for a C++ implementation of the search.
Modern companies regularly use social media to communicate with their customers. In addition to the content, the reach of a social media post may depend on the season, the day of the week, and the time of the day. We consider optimizing the timing of Facebook posts by a large Finnish consumers’ cooperative using historical data on previous posts and their reach. The content and the timing of the posts reflect the marketing strategy of the cooperative. These choices affect the reach of a post via a dynamic process where the reactions of users make the post more visible to others. We describe the causal relations of the social media publishing in the form of a directed acyclic graph, use an identification algorithm to obtain a formula for the causal effect, and finally estimate the required conditional probabilities with Bayesian generalized additive models. As a result, we obtain estimates for the expected reach of a post for alternative timings.
We consider the problem of estimating causal effects of interventions from observational data when well-known back-door and front-door adjustments are not applicable. We show that when an identifiable causal effect is subject to an implicit functional constraint that is not deducible from conditional independence relations, the estimator of the causal effect can exhibit bias in small samples. This bias is related to variables that we call trapdoor variables. We use simulated data to study different strategies to account for trapdoor variables and suggest how the related trapdoor bias might be minimized. The importance of trapdoor variables in causal effect estimation is illustrated with real data from the Life Course 1971–2002 study. Using this data set, we estimate the causal effect of education on income in the Finnish context. Bayesian modelling allows us to take the parameter uncertainty into account and to present the estimated causal effects as posterior distributions.
Repeated covariate measurements bring important information on the time-varying risk factors in long epidemiological follow-up studies. However, due to budget limitations, it may be possible to carry out the repeated measurements only for a subset of the cohort. We study cost-efficient alternatives to simple random sampling in the selection of the individuals to be remeasured. The proposed selection criteria are based on forms of D-optimality. The selection methods are compared in simulation studies and illustrated with data from the East–West study carried out in Finland from 1959 to 1999. The results indicate that cost savings can be achieved if the selection is focused on the individuals with a high expected risk of the event and, on the other hand, on those with extreme covariate values in the previous measurements.
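The D-optimality idea can be illustrated in a deliberately simplified setting. The sketch below assumes a linear model with an intercept and one covariate, not the survival model of the study. In that setting the determinant of the information matrix XᵀX is n·Σx² − (Σx)², and the subset maximising it is found by exhaustive search. The covariate values and function names are hypothetical.

```python
from itertools import combinations


def information_determinant(xs):
    """det(X^T X) for a design with an intercept column and one covariate."""
    n = len(xs)
    return n * sum(x * x for x in xs) - sum(xs) ** 2


def d_optimal_subset(covariates, k):
    """Indices of the k individuals whose covariates maximise det(X^T X)."""
    return max(
        combinations(range(len(covariates)), k),
        key=lambda idx: information_determinant([covariates[i] for i in idx]),
    )


# Hypothetical covariate values from a previous measurement round.
previous_values = [-2.0, -0.1, 0.0, 0.2, 0.5, 3.0]
print(d_optimal_subset(previous_values, 2))
print(d_optimal_subset(previous_values, 3))
```

Even in this toy setting the criterion favours individuals with extreme covariate values, in line with the conclusion stated above; exhaustive search is only feasible for small cohorts and is used here for clarity.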
We propose a framework for realistic data generation and the simulation of complex systems and demonstrate its capabilities in a health domain example. The main use cases of the framework are predicting the development of variables of interest, evaluating the impact of interventions and policy decisions, and supporting statistical method development. We present the fundamentals of the framework by using rigorous mathematical definitions. The framework supports calibration to a real population as well as various manipulations and data collection processes. The freely available open-source implementation in R embraces efficient data structures, parallel computing, and fast random number generation, hence ensuring reproducibility and scalability. With the framework, it is possible to run daily-level simulations for populations of millions of individuals for decades of simulated time. An example using the occurrence of stroke, type 2 diabetes, and mortality illustrates the usage of the framework in the Finnish context. In the example, we demonstrate the data collection functionality by studying the impact of nonparticipation on the estimated risk models and interventions related to controlling excessive salt consumption.
Background: Decreasing participation rates and selective non-participation imperil the representativeness of health examination surveys (HESs). Methods: Finnish HESs conducted in 1972–2012 are used to demonstrate that survey participation rates can be enhanced with well-planned recruitment procedures and that auxiliary information about survey non-participants can be used to reduce selection bias. Results: Experiments incorporated into pilot surveys and experience from previously conducted surveys lead to practical improvements. For example, SMS reminders were adopted as a routine procedure in the Finnish HESs after their effect was tested in a pilot study and they were found to be a cost-effective way to increase the participation rate, especially among younger age groups. Auxiliary information about survey non-participants can be obtained from many sources: sampling frames, previous measurements in a longitudinal setting, re-contacts and non-response questionnaires, and record linkage to administrative data sources. These data can be used in statistical modelling to adjust the population-level estimates for selection bias. Information on the characteristics of non-participants also helps to improve the targeting of recruitment in the future. Conclusion: All methods discussed and recommended are relatively easy to incorporate into any national HES in Europe, except the record linkage of survey data to administrative data sources. This is not feasible in all European countries because of the non-existence of registries, the lack of an identifier needed for record linkage, or national data protection legislation which restricts the data use.
Data missing not at random (MNAR) is a major challenge in survey sampling. We propose an approach based on registry data to deal with non-ignorable missingness in health examination surveys. The approach relies on follow-up data available from administrative registers several years after the survey. For illustration we use data on smoking prevalence in the Finnish National FINRISK study conducted in 1972–1997. The data consist of measured survey information including missingness indicators, register-based background information and register-based time-to-disease survival data. The parameters of the missingness mechanism are estimable with these data although the original survey data are MNAR. The underlying data generation process is modelled by a Bayesian model. The results indicate that the estimated smoking prevalence rates in Finland may be significantly affected by missing data.
Summary. Background. Systolic blood pressure, total cholesterol and smoking are known predictors of cardiovascular disease (CVD) mortality. Less is known about the effect of lifetime accumulation and changes of risk factors over time as predictors of CVD mortality, especially in very long follow-up studies. Methods. Data from the Finnish cohorts of the Seven Countries Study were used. The baseline examination was in 1959 and seven re-examinations were carried out at approximately five-year intervals. Cohorts were followed up for mortality until the end of 2011. The predictive effect of risk factors for CVD mortality was analysed with time-dependent Cox models using regular time-updated risk factors, time-dependent averages of risk factors and latest changes in risk factors, with smoothing splines to discover nonlinear effects. Results. A model using cumulative risk factors, modelled as the individual-level averages of several risk factor measurements over time, predicted CVD mortality better than a model using the most recent measurement information. This difference seemed to be most prominent for systolic blood pressure. U-shaped effects of the original predictors can be explained by partitioning a risk factor effect between the recent level and the change trajectory. The change in body mass index predicted the risk although body mass index itself did not. Conclusions. The lifetime accumulation of risk factors and the observed changes in risk factor levels over time are strong predictors of CVD mortality. It is important to investigate different ways of using the longitudinal risk factor measurements to take full advantage of them.
Semiparametric inference on average causal effects from observational data is based on assumptions yielding identification of the effects. In practice, several distinct identifying assumptions may be plausible; an analyst has to make a delicate choice between these models. In this paper, we study three identifying assumptions based on the potential outcome framework: the back-door assumption, which uses pre-treatment covariates, the front-door assumption, which uses mediators, and the two-door assumption, which uses pre-treatment covariates and mediators simultaneously. We provide the efficient influence functions and the corresponding semiparametric efficiency bounds that hold under these assumptions, and their combinations. We demonstrate that none of the identification models uniformly provides the most efficient estimation and give conditions under which some bounds are lower than others. We show when semiparametric estimating equation estimators based on influence functions attain the bounds, and study the robustness of the estimators to misspecification of the nuisance models. The theory is complemented with simulation experiments on the finite sample behavior of the estimators. The results obtained are relevant for an analyst facing a choice between several plausible identifying assumptions and corresponding estimators. Our results show that this choice implies a trade-off between efficiency and robustness to misspecification of the nuisance models.
The mean lifetime is an important characteristic of particles to be identified in nuclear physics. State-of-the-art particle detectors can identify the arrivals of single radioactive nuclei as well as their subsequent radioactive decays (departures). Challenges arise when the arrivals and departures are unmatched and the departures are only partially observed. An inefficient solution is to run experiments where the arrival rate is set very low to allow for the matching of arrivals and departures. We propose an estimation method that works for a wide range of arrival rates. The method combines an initial estimator and a numerical bias correction technique. Simulations and examples based on data on the alpha decays of Lutetium isotope 155 demonstrate that the method produces unbiased estimates regardless of the arrival rate. As a practical benefit, the estimation method enables the use of all data collected in the particle detector, which will lead to more accurate estimates and, in some cases, to shorter experiments.
Aims. We aim to adjust for potential non-participation bias in the prevalence of heavy alcohol consumption. Methods. Population survey data from Finnish health examination surveys conducted in 1987–2007 were linked to the administrative registers for mortality and morbidity follow-up until the end of 2014. Utilising these data, available for both participants and non-participants, we model the association between heavy alcohol consumption and alcohol-related disease diagnoses. Results. Our results show that the estimated prevalence of heavy alcohol consumption is on average 1.5 times higher for men and 1.8 times higher for women than what was obtained from participants only (complete case analysis). The magnitude of the difference in the mean estimates by year varies from 0 to 9 percentage points for men and from 0 to 2 percentage points for women. Conclusion. The proposed approach improves the prevalence estimation but requires follow-up data on non-participants and Bayesian modelling.
Objective: One of the main goals of health examination surveys is to provide unbiased estimates of health indicators at the population level. We demonstrate how multiple imputation methods may help to reduce the selection bias if partial data on some nonparticipants are collected. Study Design and Setting: In the FINRISK 2007 study, a population-based health study conducted in Finland, a random sample of 10,000 men and women aged 25–74 years was invited to participate. The study included a questionnaire data collection and a health examination. A total of 6,255 individuals participated in the study. Out of 3,745 nonparticipants, 473 returned a simplified questionnaire after a recontact. Both the participants and the nonparticipants were followed up for death and hospitalizations. The follow-up data made it possible to check the assumptions on the missing data mechanism, and tailored multiple imputation methods were used to handle the missing data. Results: Nonparticipation is a strong predictor of mortality in the five-year follow-up. However, the recontact response does not predict mortality or morbidity among the nonparticipants when adjusted for age and sex. The result suggests that the recontact respondents can be used as a proxy for all nonparticipants. A comparison of raw estimates and estimates adjusted for selection bias reveals clear differences in the estimated population prevalences of smoking and heavy alcohol usage. Conclusion: All efforts to collect data on nonparticipants are likely to be useful even if the response rate for the recontact remains low. Statistical analysis of the recontact respondents provides an indication of the extent of the selection bias, even in studies where follow-up data are not available to check the assumptions.
Developing environmental conservation plans involves assessing trade-offs between the benefits and costs of conservation. The benefits of conservation can be established with ecological inventories or estimated based on previously collected information. Conducting ecological inventories can be costly, and the additional information may not justify these costs. To clarify the value of these inventories, we investigate the multiple criteria value of information associated with the acquisition of improved ecological data. This information can be useful when advising the decision maker on whether to acquire better information. We extend the concept of the value of information to a multiple criteria perspective. We consider value of information for both monetary and biodiversity criteria and do not assume any fixed budget limits. Two illustrative cases are used to describe this method of evaluating the multiple criteria value of information. In the first case, we numerically evaluate the multiple criteria value of information for a single forest stand. In the second case, we present a forest planning case with four stands that describes the complex interactions between the decision maker’s preference information and the potential inventory options available. These example cases highlight the importance of examining the trade-offs when making conservation decisions. We provide a definition for the multiple criteria value of information and demonstrate the potential application when conservation issues conflict with monetary issues.
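For readers unfamiliar with the underlying concept, the sketch below computes the classical single-criterion expected value of perfect information (EVPI): the expected payoff when the true state is known before deciding, minus the best expected payoff without that knowledge. It does not reproduce the multiple criteria definition proposed in the paper. The states, actions, probabilities and payoffs are hypothetical numbers chosen only for illustration.

```python
import numpy as np

def evpi(prob, payoff):
    """Expected value of perfect information.

    prob: probabilities of the S states, shape (S,).
    payoff: payoff of each of the A actions in each state, shape (A, S).
    """
    prob = np.asarray(prob, dtype=float)
    payoff = np.asarray(payoff, dtype=float)
    with_info = (prob * payoff.max(axis=0)).sum()   # pick the best action per state
    without_info = (payoff @ prob).max()            # pick one action for all states
    return with_info - without_info

# Hypothetical single forest stand: biodiversity value is either high or low.
prob = [0.4, 0.6]            # P(high), P(low), assumed
payoff = [
    [10.0, 2.0],             # conserve: payoff if high, if low (assumed)
    [3.0, 6.0],              # harvest:  payoff if high, if low (assumed)
]
value = evpi(prob, payoff)
print(f"EVPI = {value:.2f}")
```

Under these assumed numbers, an inventory that revealed the true state would be worth paying for only if it cost less than the computed EVPI, which is about 2.4 payoff units.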
Objective: To determine the effectiveness of technology-based distance physical rehabilitation interventions in multiple sclerosis (MS) on physical activity and walking. Data sources: A systematic literature search was conducted in seven databases from January 2000 to September 2016. Randomized controlled trials of technology-based distance physical rehabilitation interventions on physical activity and walking outcome measures were included. Methods: Methodological quality of the studies was determined and a meta-analysis was performed. In addition, a subanalysis of technologies and an additional analysis comparing to no treatment were conducted. Results: The meta-analysis consisted of 11 studies. The methodological quality was good (8/13). The Internet, telephone, exergaming, and pedometers were the technologies enabling distance physical rehabilitation. Technology-based distance physical rehabilitation had a large effect on physical activity (standardized mean difference (SMD) 0.59; 95% confidence interval (95% CI) 0.38 to 0.79; p < 0.00001) compared to a control group with usual care, minimal treatment, and no treatment. A large effect was also observed on physical activity (SMD 0.59; 95% CI 0.34 to 0.83; p < 0.00001) when compared to no treatment alone. No differences were found in walking or in the subanalysis of technologies. Conclusions: Technology-based distance physical rehabilitation increased physical activity among persons with MS, but further research on walking in MS is needed. Implications for Rehabilitation: Technology-based distance physical rehabilitation interventions increase physical activity among persons with MS. This study was unable to identify whether the technologies (Internet, telephone, or combinations) lead to differing effects on physical activity or walking in the distance physical rehabilitation interventions in MS. Further research on the effectiveness of technology-based distance physical rehabilitation interventions on walking in MS is needed.
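The reported effect sizes can be sanity-checked with a few lines of arithmetic. The sketch below back-calculates the standard error from each reported 95% confidence interval and derives the corresponding z statistic and two-sided p-value. It assumes a normal approximation with a symmetric interval, which may differ from the software the authors used; the function name `z_and_p_from_ci` is introduced here for illustration only.

```python
import math

def z_and_p_from_ci(est, lo, hi, z_crit=1.959964):
    """Recover SE, z and the two-sided normal p-value from an estimate and its 95% CI."""
    se = (hi - lo) / (2 * z_crit)             # CI half-width divided by the critical value
    z = est / se
    p = math.erfc(abs(z) / math.sqrt(2))      # equals 2 * (1 - Phi(|z|))
    return se, z, p

# Values as reported in the abstract.
se_all, z_all, p_all = z_and_p_from_ci(0.59, 0.38, 0.79)   # vs. all control types
se_nt, z_nt, p_nt = z_and_p_from_ci(0.59, 0.34, 0.83)      # vs. no treatment alone

print(f"all controls: SE={se_all:.3f}, z={z_all:.2f}, p={p_all:.1e}")
print(f"no treatment: SE={se_nt:.3f}, z={z_nt:.2f}, p={p_nt:.1e}")
```

Under this approximation both comparisons give p-values below 0.00001, which agrees with the reported figures.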
Objective: We aim to predict the probability of a benefit from two contrasting exercise programs for a woman with a new diagnosis of mild knee osteoarthritis (OA). The short- and long-term effects of aquatic resistance training (ART) and high-impact aerobic land training (HLT) compared with the control will be estimated. Methods: Original data sets from two previously conducted randomised controlled trials (RCT) were combined and used in a Bayesian meta-analysis. Group differences in multiple response variables were estimated. Variables included cardiorespiratory fitness, dynamic maximum leg muscle power, maximal isometric knee extension and flexion force, pain, other symptoms and quality of life. The statistical model included a latent commitment variable for each female participant. Results: ART has a 55%–71% probability of benefits in the outcome variables and, as the main effect, the intervention outperforms the control in cardiorespiratory fitness with a probability of 71% immediately after the intervention period. HLT has a 46%–63% probability of benefits in the outcome variables after the intervention, but unlike with ART, the positive effects on physical performance fade away during the follow-up period. Overall, the differences between groups were small and the variation in the predictions between individuals was high. Conclusions: Both interventions had benefits, but ART has a slightly higher probability of long-term benefits on physical performance. Because of high individual variation and no clear advantage of one training method over the other, personal preferences should be considered in the selection of the exercise program to ensure the highest commitment to training.
Objective: To determine the effectiveness of technology-based distance physical rehabilitation intervention in multiple sclerosis (MS) on physical activity and walking. Data sources: A systematic literature search was conducted in seven databases from January 2000 to September 2016. Randomized controlled trials of technology-based distance physical rehabilitation interventions on physical activity and walking outcome measures were included. Methods: Study quality was assessed according to Furlan (2015) and a meta-analysis was performed. In addition, a subanalysis of technologies and an additional analysis comparing to no treatment were conducted. Results: The meta-analysis consisted of 11 studies. The methodological quality was good (8/13). The Internet, telephone, exergaming and pedometers were the technologies enabling distance physical rehabilitation. Technology-based distance physical rehabilitation had a large effect on physical activity (standardized mean difference (SMD) 0.59; 95% confidence interval (95% CI) 0.38 to 0.79; p < 0.00001) compared to a control group with usual care, minimal treatment, and no treatment. A large effect was also observed on physical activity (SMD 0.59; 95% CI 0.34 to 0.83; p < 0.00001) when compared to no treatment alone. No differences were found in walking or in the subanalysis of technologies. Conclusion: Technology-based distance physical rehabilitation increased physical activity among persons with MS, but further research on walking in MS is needed.
Stress tolerance and adaptation to stress are known to facilitate species invasions. Many invasive species are also pests, and insecticides are used to control them, which could shape their overall tolerance to stress. It is well known that heavy insecticide usage leads to selection of resistant genotypes, but less is known about potential effects of mild sublethal insecticide usage. We studied whether stressful, sublethal pyrethroid insecticide exposure has within-generational and/or maternal transgenerational effects on fitness-related traits in the Colorado potato beetle (Leptinotarsa decemlineata) and whether maternal insecticide exposure affects insecticide tolerance of offspring. Sublethal insecticide stress exposure had positive within- and transgenerational effects. Insecticide-stressed larvae had higher adult survival and higher adult body mass than those not exposed to stress. Furthermore, offspring whose mothers were exposed to insecticide stress had higher larval and pupal survival and were heavier as adults (only females) than those descending from control mothers. Maternal insecticide stress did not explain differences in lipid content of the offspring. To conclude, stressful insecticide exposure has positive transgenerational fitness effects in the offspring. Therefore, unsuccessful insecticide control of invasive pest species may lead to undesired side effects, since survival and higher body mass are known to facilitate population growth and invasion success.
Objective: To determine the effectiveness of technology-based distance interventions for promoting physical activity, using systematic review and meta-analysis. Methods: A literature search of studies published between 2000 and 2015 was conducted in the following databases: CENTRAL, EMBASE, Ovid MEDLINE, CINAHL, PsycINFO, OTseeker, WOS and PEDro. Studies were selected according to the PICOS framework, as follows: P (population): adults; I (intervention): technology-based distance intervention for promoting physical activity; C (comparison): similar distance intervention without technology; O (outcomes): physical activity; S (study design): randomized controlled trial. Physical activity outcomes were extracted and quality was assessed by 2 independent authors. Results: Eight studies were included in the meta-analysis. The mean (standard deviation; range) methodological quality score of the studies was 6 (1.3; 4–8). Technology-based distance interventions were not more or less effective than conventional treatment whether measured as steps/day (mean difference 1,657; 95% confidence interval (95% CI) –1,861 to 5,176, p=0.18), physical activity min/week (mean difference 0.34; 95% CI –146.3 to 146.9, p=0.92), or as overall physical activity (response ratio 1.1; 95% CI 0.8–1.4, p=0.65). No associations between the intervention duration or study quality and physical activity outcomes were found. Data were statistically and clinically heterogeneous. Conclusion: The effectiveness of technology-based distance interventions for promoting physical activity is similar to that of conventional treatment.
Background: Health status is a principal determinant of labour market participation. In this study, we examined whether excess weight is associated with withdrawal from the labour market owing to premature retirement. Methods: The analyses were based on nationally representative data from Finland over the period 2001–15 (N ∼ 2500). The longitudinal data included objective measures of body weight (i.e. body mass index and waist circumference) linked to register-based information on actual retirement age. The association between the body weight measures and premature retirement was modelled using cubic B-splines via logistic regression. The models accounted for other possible risk factors and potential confounders, such as smoking and education. Results: Excess weight was associated with an increased risk of premature retirement for both men and women. A closer examination revealed that the probability of retirement varied across the weight distribution and the results differed between sexes and weight measures. Conclusion: Body weight outside a recommended range elevates the risk of premature retirement.