[vsnet-chat 7802] Re: Period Analysis using the Lasso
Ivan Andronov
tt_ari at ukr.net
Wed May 23 16:38:41 JST 2012
A very interesting paper. Thank you.
A few small remarks. In my programs for periodogram analysis, we use the full version of the smoothing function
x(t)=a_0+a_1*(t-tm)+a_2*(t-tm)^2+...+a_p*(t-tm)^p+
(C_1*cos(w*t)+S_1*sin(w*t))+...+(C_s*cos(s*w*t)+S_s*sin(s*w*t))
i.e. an algebraic (ordinary) polynomial of order "p" (first line)
with a multi-harmonic wave, a trigonometric polynomial of order "s" (second line), superimposed on it.
For each trial frequency, the coefficients are determined using the least squares method.
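A minimal sketch of this per-frequency fit, in Python (these are not the programs mentioned above; the names design_row, solve, fit and periodogram are hypothetical, and the normal equations are solved by plain Gaussian elimination only to keep the example short). It assumes tm is the mean time and that trial frequencies are given in cycles per unit time:

```python
import math

def design_row(t, tm, w, p, s):
    """Basis functions of x(t): powers of (t-tm) up to p, then cos/sin harmonics up to s."""
    row = [(t - tm) ** k for k in range(p + 1)]
    for j in range(1, s + 1):
        row += [math.cos(j * w * t), math.sin(j * w * t)]
    return row

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit(times, values, w, p=0, s=1):
    """Least-squares coefficients and residual sum of squares for angular frequency w."""
    tm = sum(times) / len(times)  # assumption: tm is the mean time
    rows = [design_row(t, tm, w, p, s) for t in times]
    n = len(rows[0])  # 1 + p + 2*s unknowns
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(n)]
    coef = solve(ata, aty)
    rss = sum((y - sum(c * f for c, f in zip(coef, r))) ** 2
              for r, y in zip(rows, values))
    return coef, rss

def periodogram(times, values, freqs, p=0, s=1):
    """Residual sum of squares for every trial frequency (cycles per unit time)."""
    return [fit(times, values, 2 * math.pi * f, p, s)[1] for f in freqs]

# Synthetic, unevenly sampled test signal with frequency 0.13 (illustrative data only).
times = [i + 0.3 * math.sin(1.7 * i) for i in range(60)]
values = [2 + 0.5 * math.cos(2 * math.pi * 0.13 * t)
          + 0.3 * math.sin(2 * math.pi * 0.13 * t) for t in times]
freqs = [0.05 + 0.01 * k for k in range(26)]
rss = periodogram(times, values, freqs)
best = min(range(len(freqs)), key=lambda k: rss[k])
print("best trial frequency:", freqs[best])
```

The smallest residual sum of squares marks the best trial frequency; other test functions can be derived from the same fit.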
For the simplest case p=0, s=1, we fit
x(t)=a_0+(C_1*cos(w*t)+S_1*sin(w*t))
with 3(!) parameters a_0, C_1, S_1, instead of the two-parameter model (C_1, S_1 only)
x(t)-xm=(C_1*cos(w*t)+S_1*sin(w*t))
because the sample mean xm generally does not coincide with the best-fit parameter a_0
(when the observations are irregularly distributed in phase).
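A small numerical illustration of this point, as a sketch in Python under stated assumptions: the data are synthetic and noise-free, the observations cover only phases 0 to 0.4 of each cycle, and the helper names solve and fit_mean_plus_sine are hypothetical.

```python
import math

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_mean_plus_sine(times, values, w):
    """Least-squares a_0, C_1, S_1 for x(t) = a_0 + C_1*cos(w*t) + S_1*sin(w*t)."""
    rows = [[1.0, math.cos(w * t), math.sin(w * t)] for t in times]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    return solve(ata, aty)

w = 2 * math.pi  # period of one time unit
# Five cycles, each observed only at phases 0.00 ... 0.40 (uneven phase coverage).
times = [n + 0.04 * k for n in range(5) for k in range(11)]
values = [10 + 2 * math.cos(w * t) + math.sin(w * t) for t in times]

xm = sum(values) / len(values)              # sample mean
a0, c1, s1 = fit_mean_plus_sine(times, values, w)
print("sample mean xm :", xm)
print("best-fit a_0   :", a0)
```

Here the three-parameter fit recovers the constant used to generate the data, while the sample mean is shifted by the part of the wave that happens to be observed.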
Likewise, the trend a_0+a_1*(t-tm)+a_2*(t-tm)^2+...+a_p*(t-tm)^p
determined from the full system of (1+p+2s) normal equations generally differs from that
obtained by preliminary "prewhitening" (i.e. solving only the (1+p) normal equations for the polynomial).
These mathematical details may influence the parameters of the fit
and thus the physical conclusions.
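The difference between the two approaches can be seen in a sketch like the following (Python, synthetic noise-free data covering 1.25 periods, p=1 and s=1; the helper names solve and lstsq are hypothetical and the example is only an illustration, not the programs discussed here):

```python
import math

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def lstsq(rows, values):
    """Least-squares coefficients from the normal equations."""
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(n)]
    return solve(ata, aty)

w = 2 * math.pi                           # period of one time unit
times = [0.05 * k for k in range(26)]     # 1.25 periods, not an exact number
tm = sum(times) / len(times)
values = [5 + 0.1 * (t - tm) + math.sin(w * t) for t in times]

# "Prewhitening": the trend alone, 1+p = 2 normal equations.
trend_only = lstsq([[1.0, t - tm] for t in times], values)

# Simultaneous fit: trend and wave together, 1+p+2s = 4 normal equations.
joint = lstsq([[1.0, t - tm, math.cos(w * t), math.sin(w * t)] for t in times],
              values)

print("slope from prewhitening    :", trend_only[1])
print("slope from simultaneous fit:", joint[1])
```

The simultaneous fit returns the slope used to generate the data, whereas the prewhitened slope absorbs part of the wave because the interval is not an exact number of periods.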
To determine statistically optimal values of "p" and "s", we use three criteria:
1. Fisher's criterion (ANOVA), which often gives a larger number of parameters,
as its main assumption of uncorrelated, normally distributed errors is often not
very accurate for real measurements, especially photometric ones with
additional types of variability and thus _correlation_ between close residuals.
2. The minimal "r.m.s. accuracy estimate of the fitting function at the arguments of observations", sigma[X_c]
(sometimes replaced by the minimal r.m.s. accuracy estimate of the fitting function
at a given argument (phase),
or of the timing of an extremum).
3. The maximal "signal-to-noise ratio", i.e. the ratio of the r.m.s. deviation of the smoothing function from its mean value
to sigma[X_c].
For noisy data, criteria "2" and "3" typically suggest a smaller number of parameters than "1", which tends to fit smaller fluctuations.
For more detail, see my invited reviews:
(Multi-)frequency variations of stars. Some methods and results
http://cdsads.u-strasbg.fr/abs/1994OAP.....7...49A
(direct link http://il-a.pochta.ru/oap7_049.pdf )
Multiperiodic versus noise variations: mathematical methods
2003ASPC..292..391A
Advanced Methods for Determination of Arguments of Characteristic Events
2005ASPC..335...37A
(and, for searching for periodicity in the presence of another period)
On Hour-Scale Photometric Variations of TT Arietis
1992IBVS.3763....1T
These basic programs and their improvements have been applied
to 1400+ objects from variable stars of different types to AGN.
Because of the wide variety of types of variability,
we use a set of complementary methods, which give the same result
for a test sinusoid "observed" regularly over an exact number of periods.
Otherwise we prefer to use a complete orthogonalization
of the basis functions instead of abbreviated formulae.
> To those who are interested in period analysis.
> It's now time to start learning R -- it's free and powerful
> in data analysis. R has been already widely used and discussed in
> Japanese variable star community (particularly amateurs), and we've found
> it very suited for medium-sized data analysis (e.g. arXiv:1111.4286),
> time-series analysis, one-dimensional spectroscopic analysis etc.
>
> http://arxiv.org/abs/1205.4791
>
> ===
>
> Title: Period Analysis using the Least Absolute Shrinkage and Selection
> Operator (Lasso)
> Authors: Taichi Kato (Kyoto U) and Makoto Uemura (Hiroshima U)
> Categories: astro-ph.IM
> Comments: 9 pages, 13 figures, accepted for publication in PASJ
> License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
> \\
> We introduced least absolute shrinkage and selection operator (lasso) in
> obtaining periodic signals in unevenly spaced time-series data. A very simple
> formulation with a combination of a large set of sine and cosine functions has
> been shown to yield a very robust estimate, and the peaks in the resultant
> power spectra were very sharp. We studied the response of lasso to low
> signal-to-noise data, asymmetric signals and very closely separated multiple
> signals. When the length of the observation is sufficiently long, all of them
> were not serious obstacles to lasso. We analyzed the 100-year visual
> observations of delta Cep, and obtained a very accurate period of 5.366326(16)
> d. The error in period estimation was several times smaller than in Phase
> Dispersion Minimization. We also modeled the historical data of R Sct, and
> obtained a reasonable fit to the data. The model, however, lost its predictive
> ability after the end of the interval used for modeling, which is probably a
> result of chaotic nature of the pulsations of this star. We also provide a
> sample R code for making this analysis.