ISSN: 2822-0838 Online

Paternal Genetic Variation of the Sea Nomads from Southern Thailand Revealed by Y Chromosomal Short Tandem Repeats

Pongsapak Jitsuwan, Metawee Srikummool, Jatupol Kampuansai, Kanha Muisuk, Nonglak Prakhun, and Wibhu Kutanan*
Published Date : March 23, 2026
DOI : https://doi.org/10.12982/NLSC.2026.053
Journal Issues : Online First

Abstract The Austronesian-speaking Moken, Moklen, and Urak Lawoi are regarded as the sea nomads of Thailand. Also known as Chao Lay, the Thai sea nomads traditionally inhabit the coastal regions and islands of the Andaman Sea in southern Thailand. Their maritime lifestyle and historical connections with other groups have attracted significant interest in their genetic structure. Our previous mitochondrial DNA study reported that the Moklen and Urak Lawoi are more closely related to each other and closer to Island Southeast Asian groups. Here, we newly genotyped 17 Y-chromosomal short tandem repeat (Y-STR) loci in 169 males from the Moklen and Urak Lawoi, together with other populations from southern and other regions of Thailand. These new data were combined with published Y-STRs from Thai populations to establish a Y-STR dataset for the region. Our findings revealed that the Moken and Moklen are paternally related to each other, in contrast to the maternal side, and both exhibit genetic ties with Mainland Southeast Asian populations. The Urak Lawoi, on the other hand, tell a different story: they are genetically closer to the broader southern Thai populations, Austronesian-speaking groups from Island Southeast Asia, and South Asian populations. Forensic parameters and genetic diversity results indicated that the Y-STR datasets for both southern Thai and total Thai populations could serve as a preliminary reference dataset for southern Thai populations. However, the low genetic diversity of some small groups, for example the Moklen, emphasizes the need for future studies with larger sample sizes, which would be more useful for forensic investigations.

 

Keywords: Y-STR, Sea nomads, Paternal lineage, Southern Thailand, Population genetics

 

Funding: This study was supported by Naresuan University (NU) and National Science, Research and Innovation Fund (NSRF) (Grant No: R2568B017). W.K. and M.S. were also partially supported by the Global and Frontier Research University Fund, Naresuan University (Grant number: R2566 C051).

 

Citation:  Jitsuwan, P., Srikummool, M., Kampuansai, J., Muisuk, K., Prakhun, N., and Kutanan, W. 2026. Paternal genetic variation of the sea nomads from Southern Thailand revealed by Y chromosomal short tandem repeats. Natural and Life Sciences Communications. 25(3): e2026053.

 


 

INTRODUCTION

Southern Thailand, situated on the Malay Peninsula, is bordered by Central Thailand to the north, the Andaman Sea to the west, the Gulf of Thailand to the east, and Malaysia to the south. In 2020, its population was approximately 9.16 million, representing 13.35% of Thailand's total population (Eberhard et al., 2020). The region's demographic landscape is dominated by two main groups: the Tai-Kadai (TK)-speaking southern Thai ("Khon Tai," 66%) and the Austronesian (AN)-speaking southern Thai ("Melayu," 33%). Two minority groups account for the remaining 0.33% of the population: the Austroasiatic (AA)-speaking Maniq, a hunter-gatherer society, and the AN-speaking sea nomads. Known as the "Chao Lay," the sea nomads traditionally inhabit the west coast and number around 12,000 individuals across three distinct groups: the Moken (2,100), Moklen (3,700), and Urak Lawoi (6,200) (Attavanich, 2016). Their approximately 41 communities are geographically distributed, with the Moken near southern Myanmar, the Urak Lawoi near the Malaysian border, and the Moklen in between (Bellina et al., 2021). While the Moken have largely retained their traditional sea-based lifestyle, the Moklen and Urak Lawoi are more settled, working in coastal fisheries, agriculture, and other occupations. Having adopted aspects of mainstream culture, these latter groups often refer to themselves as "Thai Mai," or "New Thai" (Thongtaweewiwat, 2006). Many Moklen also refer to themselves as "Chao Bok" (coastal people), reflecting a shift away from a nomadic lifestyle and the erosion of their traditional culture (Hope, 2000; Nawichai, 2008). While intermarriage between Chao Lay subgroups was historically occasional, intermarriage with non-Chao Lay groups has become more common recently (Princess Maha Chakri Sirindhorn Anthropology Centre, 2021). The Moklen, in particular, have frequently interacted with and been influenced by other southern Thais (Hoogervorst, 2012).
Furthermore, the Moken's traditional boat, the kabang, is distinct from the dredge boats used by the Moklen and Urak Lawoi (Attavanich, 2016). Although all three groups speak languages belonging to the AN family, linguistic studies reveal a closer relationship between Moken and Moklen. These two languages are ultimately related to the Malayic group, either as members or as a parallel branch of western Malayo-Polynesian, and both have been heavily influenced by AA languages (Larish, 2000, 2005). In contrast, the Urak Lawoi language is more distinct, showing a closer relationship to Malay and little influence from AA languages (Steinhauer, 2008; Robert, 2010).

 

Previous genetic studies of populations from southern Thailand have been less frequent than those in other regions of the country. Regarding the majority groups, an autosomal STR study indicated that the Thai-Malay Muslims (AN-speaking group) and Thai Buddhists (TK-speaking group) who lived in the five southernmost Thai provinces showed no significant genetic differences (Kutanan et al., 2014). A more recent study on high-resolution mitochondrial DNA (mtDNA) and Y chromosome sequences revealed that both southern Thai populations (TK and AN) exhibit a close genetic relatedness with AA-speaking Mon and TK-speaking central Thais (Woravatin et al., 2023). This study also found that South Asian (SA) ancestry was one of their shared characteristics, a finding consistent with previous genome-wide data reports (Kutanan et al., 2021).

 

Regarding the sea nomad groups, two previous studies analyzed mtDNA from the hypervariable region 1 (HVR1). One study of Moken from southern Myanmar revealed low maternal genetic diversity, with only two haplotypes (M21 and M46) from 12 samples (Dancause et al., 2009). In contrast, another study of Moken from Chang Island, Ranong Province, identified four mtDNA haplogroups: M21b2, D4e1a, M46a, and F1a1c1 (Seetaraso et al., 2020). A broader analysis of complete mtDNA genomes from Moken, Moklen, and Urak Lawoi, along with TK-speaking southern Thai populations, revealed that the sea nomads exhibit lower genetic diversity compared to the majority southern Thais. Among the sea nomad groups, this study suggested that the Moklen were more closely related to the Urak Lawoi than to the Moken. The predominance of haplogroups D4e1a, E1a1a1a, M21b2, M46a, M50a1, and M71c among the sea nomads supports a Mainland Southeast Asia (MSEA) origin, with evidence of limited maternal genetic interaction with Island Southeast Asia (ISEA) populations (Kutanan et al., 2025). Further supporting this complexity, autosomal STR analysis has revealed distinct genetic patterns, identifying a unique profile in the Urak Lawoi of Satun Province and evidence of admixture in the Moklen of Phuket Province (Srikummool et al., 2022).

 

However, genetic studies on the sea nomad populations in Thailand have largely been limited to mtDNA, and there is a paucity of Y-chromosomal analysis. To date, only one study on Y-chromosomal short tandem repeats (Y-STRs) has been reported in Moken from Ranong Province (Seetaraso et al., 2020). This gap is critical because the Y chromosome offers unique insights. Due to its non-recombining nature and strict patrilineal inheritance, it serves as a powerful marker for tracing paternal ancestry and has become indispensable in population genetics and evolutionary studies (Kampuansai et al., 2020; Lao et al., 2025). Furthermore, Y-STR polymorphisms are increasingly valuable in forensic science. They are particularly useful in cases where autosomal DNA profiling is uninformative, as Y-STR haplotypes can effectively exclude suspects or link male relatives to the same paternal line (Huang et al., 2011). This dual utility in both evolutionary and forensic contexts underscores the need for comprehensive Y-chromosome data for these populations.

 

In this study, we investigated the paternal genetic variation of Thailand's sea nomads by analyzing Y-STRs. We generated new genotypes for several key populations, including the Urak Lawoi and Moklen sea nomads, other southern Thai groups, and comparative populations from northeastern and western Thailand. To establish a comprehensive regional database, we combined our new data with published Y-STR profiles from the Moken, other Thai ethnic groups, and populations across SA, MSEA, ISEA, and East Asia (EA) (Li et al., 2019; Suttijit et al., 2019; Kampuansai et al., 2020; Seetaraso et al., 2020). This allowed us to construct detailed allele frequency databases for both southern Thailand and the broader Thai population. Our results show that the Moken and Moklen are more closely related to each other and genetically distinct from other populations. In contrast, the Urak Lawoi exhibit mixed genetic ancestries, and the Moklen display contrasting paternal and maternal genetic variations.

 

MATERIALS AND METHODS

Samples

We newly generated Y-STR profiles for a total of 169 individuals using DNA samples from previous studies (Srikummool et al., 2022; Kutanan et al., 2025). These individuals belong to seven populations: four from southern Thailand (SouthernThai1, SouthernThai2, Urak Lawoi, and Moklen), two from northeastern Thailand (Khmer and Lao Isan), and one from western Thailand (MonW) (Table 1). In the original studies, donors for buccal swab collection were interviewed to select healthy volunteers who were unrelated for at least two generations. Following this screening, written informed consent was obtained from all participants. Published Y-STR profiles of 276 individuals from other Thai populations, including southern Thai Moken, northern Thai Yong, northeastern Thai Mon (MonNE), and a general Thai group (of unspecified origin), were also incorporated from previous studies (Li et al., 2019; Suttijit et al., 2019; Kampuansai et al., 2020; Seetaraso et al., 2020). This brought the total Thai dataset to 445 samples (Table 1). The research protocol involving human subjects was approved by the Khon Kaen University Ethics Committee (Protocol No. HE622223) and the Naresuan University Institutional Review Board (COA No. 250/2023).
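The sample counts quoted above can be cross-checked against the per-population sizes in Table 1. The short sketch below only re-adds those published numbers; the dictionary and variable names are illustrative and not part of the study's workflow.

```python
# Sample sizes copied from Table 1 (number of samples per population).
newly_genotyped = {
    "SouthernThai1": 31, "SouthernThai2": 20, "UrakLawoi": 10, "Moklen": 13,
    "Khmer": 20, "LaoIsan": 35, "MonW": 40,
}
published = {"Moken": 11, "MonNE": 22, "Yong": 111, "Thai": 132}

# Populations that make up the southern Thai dataset.
southern = ["SouthernThai1", "SouthernThai2", "UrakLawoi", "Moklen", "Moken"]
all_thai = {**newly_genotyped, **published}

n_new = sum(newly_genotyped.values())
n_published = sum(published.values())
n_southern = sum(all_thai[name] for name in southern)
n_total = sum(all_thai.values())
print(n_new, n_published, n_southern, n_total)  # 169 276 85 445
```

The totals agree with the 169 newly genotyped and 276 published profiles, and with the southern Thai (n = 85) and total Thai (n = 445) datasets used later.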

 

Table 1. General information of the entire set of 32 populations and genetic diversity and forensic parameter values.

| Population | Language family | Country | Geographic location | References | No. of samples (a) | No. of haplotypes (b) | DC (c) | HD (d) | SD (HD) | MPD | SD (MPD) | Average GD | SD (GD) | HMP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SouthernThai1 | Tai-Kadai | Thailand | Mainland Southeast Asia | Present Study | 31 | 28 | 0.903 | 0.994 | 0.010 | 11.402 | 5.315 | 0.671 | 0.348 | 0.039 |
| SouthernThai2 | Tai-Kadai | Thailand | Mainland Southeast Asia | Present Study | 20 | 20 | 1.000 | 1.000 | 0.016 | 12.200 | 5.755 | 0.718 | 0.378 | 0.050 |
| UrakLawoi | Austronesian | Thailand | Mainland Southeast Asia | Present Study | 10 | 10 | 1.000 | 1.000 | 0.045 | 11.089 | 5.510 | 0.652 | 0.367 | 0.100 |
| Moklen | Austronesian | Thailand | Mainland Southeast Asia | Present Study | 13 | 9 | 0.692 | 0.936 | 0.051 | 9.949 | 4.869 | 0.585 | 0.322 | 0.136 |
| Moken | Austronesian | Thailand | Mainland Southeast Asia | Seetaraso et al. (2020) | 11 | 9 | 0.818 | 0.946 | 0.066 | 11.418 | 5.615 | 0.672 | 0.373 | 0.140 |
| Khmer | Austroasiatic | Thailand | Mainland Southeast Asia | Present Study | 20 | 17 | 0.850 | 0.979 | 0.025 | 8.479 | 4.094 | 0.499 | 0.269 | 0.070 |
| MonNE | Austroasiatic | Thailand | Mainland Southeast Asia | Suttijit et al. (2019) | 22 | 18 | 0.818 | 0.983 | 0.018 | 10.519 | 4.985 | 0.619 | 0.327 | 0.062 |
| MonW | Austroasiatic | Thailand | Mainland Southeast Asia | Present Study | 40 | 39 | 0.975 | 0.999 | 0.006 | 12.065 | 5.571 | 0.710 | 0.364 | 0.026 |
| LaoIsan | Tai-Kadai | Thailand | Mainland Southeast Asia | Present Study | 35 | 33 | 0.943 | 0.995 | 0.009 | 10.215 | 4.778 | 0.601 | 0.312 | 0.033 |
| Yong | Tai-Kadai | Thailand | Mainland Southeast Asia | Kampuansai et al. (2020) | 111 | 100 | 0.901 | 0.998 | 0.002 | 11.251 | 5.146 | 0.662 | 0.335 | 0.011 |
| Thai | Tai-Kadai | Thailand | Mainland Southeast Asia | Li et al. (2019) | 132 | 131 | 0.992 | 0.999 | 0.001 | 11.078 | 5.065 | 0.652 | 0.330 | 0.008 |
| Filipino1 | Austronesian | The Philippines | Island Southeast Asia | Li et al. (2019) | 66 | 62 | 0.939 | 0.997 | 0.004 | 10.782 | 4.969 | 0.634 | 0.324 | 0.018 |
| Filipino2 | Austronesian | The Philippines | Island Southeast Asia | Hwa et al. (2010) | 53 | 50 | 0.943 | 0.998 | 0.004 | 10.843 | 5.011 | 0.638 | 0.327 | 0.021 |
| Amis | Austronesian | China (Taiwan) | East Asia | Hwa et al. (2010) | 27 | 26 | 0.963 | 0.997 | 0.011 | 9.721 | 4.596 | 0.572 | 0.301 | 0.040 |
| Tao | Austronesian | China (Taiwan) | East Asia | Hwa et al. (2010) | 42 | 19 | 0.452 | 0.954 | 0.013 | 8.792 | 4.138 | 0.517 | 0.270 | 0.069 |
| Indigenous Taiwanese | Austronesian | China (Taiwan) | East Asia | Hwa et al. (2010) | 63 | 55 | 0.873 | 0.995 | 0.004 | 8.156 | 3.834 | 0.480 | 0.250 | 0.021 |
| TaiwaneseHan | Sino-Tibetan | China (Taiwan) | East Asia | Hwa et al. (2010) | 197 | 191 | 0.970 | 0.999 | 0.001 | 10.910 | 4.982 | 0.642 | 0.324 | 0.005 |
| Han | Sino-Tibetan | China (Taiwan) | East Asia | Hwa et al. (2010) | 97 | 97 | 1.000 | 1.000 | 0.002 | 11.040 | 5.060 | 0.649 | 0.330 | 0.010 |
| MalayChampa | Austronesian | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 12 | 12 | 1.000 | 1.000 | 0.034 | 11.848 | 5.773 | 0.697 | 0.382 | 0.083 |
| MalayMinang | Austronesian | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 20 | 20 | 1.000 | 1.000 | 0.016 | 11.695 | 5.530 | 0.688 | 0.363 | 0.050 |
| MalayBugis | Austronesian | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 13 | 13 | 1.000 | 1.000 | 0.030 | 11.474 | 5.567 | 0.675 | 0.368 | 0.077 |
| MalayKelantan | Austronesian | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 42 | 42 | 1.000 | 1.000 | 0.005 | 11.761 | 5.432 | 0.692 | 0.355 | 0.024 |
| MalayJawa | Austronesian | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 13 | 13 | 1.000 | 1.000 | 0.030 | 11.487 | 5.573 | 0.676 | 0.368 | 0.077 |
| MalayBanjar | Austronesian | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 17 | 17 | 1.000 | 1.000 | 0.020 | 10.551 | 5.060 | 0.621 | 0.333 | 0.059 |
| Kensiu | Austroasiatic | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 10 | 10 | 1.000 | 1.000 | 0.045 | 11.089 | 5.510 | 0.652 | 0.367 | 0.100 |
| Batek | Austroasiatic | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 13 | 13 | 1.000 | 1.000 | 0.030 | 10.462 | 5.104 | 0.615 | 0.337 | 0.077 |
| Semai | Austroasiatic | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 12 | 12 | 1.000 | 1.000 | 0.034 | 9.848 | 4.852 | 0.579 | 0.321 | 0.083 |
| Kedayan | Austronesian | Malaysia | Mainland Southeast Asia | SharifahNany RahayuKarmilla et al. (2018) | 128 | 76 | 0.594 | 0.986 | 0.003 | 10.207 | 4.691 | 0.600 | 0.306 | 0.022 |
| Jharkhand | N.A. | India | South Asia | Imam et al. (2018) | 102 | 97 | 0.951 | 0.999 | 0.002 | 10.953 | 5.021 | 0.644 | 0.327 | 0.011 |
| Odisha | Indo-European | India | South Asia | Sahoo et al. (2021) | 202 | 196 | 0.970 | 0.999 | 0.001 | 11.643 | 5.296 | 0.685 | 0.345 | 0.005 |
| Sheikh | Indo-European | India | South Asia | Aslam et al. (2020) | 176 | 126 | 0.716 | 0.989 | 0.003 | 10.960 | 5.006 | 0.645 | 0.326 | 0.016 |
| Punjabi | Indo-European | Pakistan | South Asia | Adnan et al. (2018) | 94 | 82 | 0.872 | 0.996 | 0.003 | 10.660 | 4.898 | 0.627 | 0.319 | 0.014 |

a = No. of samples

b = No. of haplotypes

c = Discriminating capacity (DC)

d = Haplotype diversity (HD)

Average GD = Average gene diversity

SD = Standard deviation

MPD = Mean number of pairwise differences

HE = Expected heterozygosity

HMP = Haplotype matching probability

 

Y-STR typing

DNA concentration was measured using the MaestroNano MN-913 spectrophotometer (Maestrogen, Taiwan), and samples were subsequently diluted to a final concentration of 0.1–0.5 ng/μl, following the manufacturer's genotyping procedure for the AmpFlSTR® Y-filer PCR amplification kit (Applied Biosystems, USA). DNA samples were amplified in a multiplex PCR on a GeneAmp PCR System 9700 (Applied Biosystems, USA), using the PCR conditions specified in the manufacturer's instructions. The amplicons were genotyped via multi-capillary electrophoresis on an ABI 3130 Genetic Analyzer (Applied Biosystems) with the GeneScan-500 LIZ size standard and the Y-filer allelic ladder (Applied Biosystems, USA). Finally, Y-STR allele calling was performed using GeneMapper ID-X v.1.4 software (Applied Biosystems, USA).

 

Statistical analyses

Haplotypes were constructed using 17 Y-STR loci: DYS19, DYS385a/b, DYS389 I, DYS389 II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, and Y GATA H4. As per previous studies, DYS385a/b were treated as two separate loci (Tian-Xiao et al., 2009). The 445 genotypes of the total Thai dataset were combined with 1,399 published genotypes from comparative populations across MSEA, ISEA, EA, and SA (Table 1). Since this study focuses on the paternal genetic history of southern Thai populations, Y-STR allele frequencies were calculated for both a southern Thai-only dataset (n = 85) and the total Thai dataset (n = 445) using Arlequin v.3.5.1.3 (Excoffier and Lischer, 2010). For the combined dataset of 1,844 Y-STR genotypes from 32 populations, several measures of intra-population genetic diversity were calculated using Arlequin: number of haplotypes, haplotype diversity (HD), gene diversity (GD), and the mean number of pairwise differences (MPD). HD and GD were calculated as $HD = \frac{N}{N-1}\left(1 - \sum p_i^2\right)$ and $GD = \frac{N}{N-1}\left(1 - \sum a_i^2\right)$, where $N$ is the sample size, $p_i$ is the frequency of the $i$th haplotype, and $a_i$ is the frequency of the $i$th allele. Additionally, we manually calculated the discrimination capacity (DC) as $DC = M/N$, where $M$ is the number of different haplotypes and $N$ is the sample size. The haplotype match probability (HMP) was calculated using the formula $HMP = \sum p_i^2$, where $p_i$ is the frequency of the $i$th haplotype. The matching probability of shared haplotypes between two different populations was calculated as $\sum p_i\,p_j$, summed over the haplotypes shared by the two populations, where $p_i$ is the frequency of the $i$th haplotype in one population and $p_j$ is the frequency of the $j$th haplotype in the other.
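The forensic parameters defined above can be illustrated with a minimal Python sketch. It assumes the standard forms HD = N/(N-1) × (1 - Σp_i²), DC = M/N, and HMP = Σp_i², and it is not the Arlequin implementation used in the study. The function names and the toy haplotype labels are hypothetical; the count distribution (3, 2, 2, and six singletons) was chosen only because it is consistent with the 13 samples and 9 haplotypes reported for the Moklen in Table 1.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Summarise one population given a list of hashable haplotypes."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    m = len(counts)                                   # number of distinct haplotypes (M)
    hmp = sum((c / n) ** 2 for c in counts.values())  # HMP = sum of p_i^2
    hd = n / (n - 1) * (1 - hmp)                      # HD = N/(N-1) * (1 - sum p_i^2)
    return {"N": n, "M": m, "DC": m / n, "HMP": hmp, "HD": hd}


def shared_match_probability(pop_a, pop_b):
    """Sum of p_i * p_j over haplotypes observed in both populations."""
    freq_a, freq_b = Counter(pop_a), Counter(pop_b)
    shared = freq_a.keys() & freq_b.keys()
    return sum(freq_a[h] / len(pop_a) * freq_b[h] / len(pop_b) for h in shared)


# Hypothetical labels: 13 men, 9 haplotypes with counts 3, 2, 2 and six singletons.
toy_moklen = ["h1"] * 3 + ["h2"] * 2 + ["h3"] * 2 + [f"s{i}" for i in range(6)]
stats = haplotype_stats(toy_moklen)
print({key: round(value, 3) for key, value in stats.items()})
```

With this assumed distribution the sketch gives DC = 0.692, HMP = 0.136, and HD = 0.936, matching the Moklen row of Table 1, although other count distributions could in principle produce similar values.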

 

To investigate population relationships, a genetic distance matrix based on the sum of squared differences (Rst) was generated in Arlequin and then plotted in three dimensions by multidimensional scaling (MDS) using STATISTICA 13.0 (StatSoft, Inc., USA). Heatmap visualizations of the Rst and MDS values were produced in R (R Development Core Team) using the ape, pegas, adegenet, and ggplot2 packages.
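To show how a distance matrix is turned into MDS coordinates, the sketch below implements classical (Torgerson) MDS in plain Python, extracting the leading eigenvectors by power iteration with deflation. This is a swapped-in illustration: the study used STATISTICA, whose MDS procedure may differ (for example, it may be non-metric), and the function name, the start vector, and the three-point toy matrix are assumptions made here.

```python
import math


def classical_mds(dist, n_dims=3, n_iter=1000):
    """Classical (Torgerson) MDS of a symmetric distance matrix (list of lists)."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double-centred matrix B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * n_dims for _ in range(n)]
    for k in range(n_dims):
        v = [math.sin(i + k + 1.0) for i in range(n)]   # deterministic start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(n_iter):                          # power iteration
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break                                    # matrix is numerically zero
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(x * y for x, y in zip(v, bv))       # Rayleigh quotient
        if eigval <= 1e-12:
            break                                        # no positive eigenvalue found
        for i in range(n):
            coords[i][k] = math.sqrt(eigval) * v[i]
        for i in range(n):                               # deflate the found component
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Toy matrix: three hypothetical populations lying on a line at 0, 3 and 4.
toy = [[0.0, 3.0, 4.0], [3.0, 0.0, 1.0], [4.0, 1.0, 0.0]]
points = classical_mds(toy, n_dims=2)
```

For the toy matrix the recovered coordinates reproduce the input distances. For real Rst matrices, which need not be Euclidean, the sketch stops as soon as the dominant remaining eigenvalue is not positive, and a linear-algebra library would normally replace the power iteration.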

 

An analysis of molecular variance (AMOVA) (Excoffier et al., 1992), which assesses genetic variance at three hierarchical levels (among individuals within a population, among populations within a group, and among groups of populations) (Bertorelle et al., 1996), was performed in Arlequin. In this analysis, the studied populations were grouped by both geographic and linguistic categories (Table 2), and the statistical significance of each variance component and Rst value was evaluated using 1,000 permutations (Excoffier et al., 1992).
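The variance partition behind an AMOVA can be sketched for the simplest case of a single group of populations. The code below is an illustration under stated assumptions, not Arlequin's implementation: it uses the sum of squared repeat-number differences as the distance between haplotypes, covers only the one-group design (within vs. among populations), and estimates significance by reshuffling individuals among populations. All function names and the toy haplotypes are hypothetical, and each population is assumed to contain at least two individuals.

```python
import random
from itertools import combinations


def sq_diff(h1, h2):
    """Sum of squared repeat-number differences between two haplotypes."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))


def amova_one_level(populations):
    """Variance components for populations given as lists of haplotype tuples."""
    sizes = [len(p) for p in populations]
    n_total, n_pops = sum(sizes), len(populations)
    pooled = [h for p in populations for h in p]
    ssd_total = sum(sq_diff(a, b) for a, b in combinations(pooled, 2)) / n_total
    ssd_within = sum(
        sum(sq_diff(a, b) for a, b in combinations(p, 2)) / len(p)
        for p in populations
    )
    msd_among = (ssd_total - ssd_within) / (n_pops - 1)
    msd_within = ssd_within / (n_total - n_pops)
    n0 = (n_total - sum(s * s for s in sizes) / n_total) / (n_pops - 1)
    var_among = (msd_among - msd_within) / n0    # estimate can be negative
    var_within = msd_within
    total = var_among + var_within               # zero only if all haplotypes are identical
    return {
        "pct_among": 100 * var_among / total,
        "pct_within": 100 * var_within / total,
        "phi_st": var_among / total,
    }


def permutation_p_value(populations, n_perm=1000, seed=0):
    """Share of random reassignments whose phi_st is at least the observed one."""
    rng = random.Random(seed)
    observed = amova_one_level(populations)["phi_st"]
    sizes = [len(p) for p in populations]
    pooled = [h for p in populations for h in p]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if amova_one_level(shuffled)["phi_st"] >= observed:
            hits += 1
    return hits / n_perm


separated = [[(10,), (10,)], [(12,), (12,)]]   # populations share no alleles
identical = [[(10,), (12,)], [(10,), (12,)]]   # populations have the same composition
res_separated = amova_one_level(separated)
res_identical = amova_one_level(identical)
p_separated = permutation_p_value(separated)
```

In the fully separated toy case all variation lies among populations, whereas in the identical case the among-population estimate is negative. This is the same effect that yields the slightly negative component (-0.33) and the within-population share above 100% for the ISEA group in Table 2.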

 

Table 2. AMOVA results.

| Groups | Number of groups | Number of populations | Percent variation: within populations | Percent variation: between populations within groups | Percent variation: among groups |
|---|---|---|---|---|---|
| Total | 1 | 32 | 86.14 | 13.86** | |
| Geography (MSEA, ISEA, EA, SA) | 4 | 32 | 84.5** | 8.28** | 7.21** |
| Language (AN, AA, TK, ST, IE and Jharkhand) | 5 | 32 | 84.76** | 6.87** | 8.37** |
| MSEA | 1 | 21 | 90.59 | 9.41** | |
| ISEA | 1 | 2 | 100.33 | -0.33 | |
| EA | 1 | 5 | 86.22 | 13.78** | |
| SA | 1 | 4 | 94.16 | 5.84** | |
| Thailand | 1 | 11 | 92.27 | 7.73** | |
| AN | 1 | 15 | 90.14 | 9.86** | |
| AA | 1 | 6 | 78.83 | 21.17** | |
| TK | 1 | 5 | 96.59 | 3.41** | |
| ST | 1 | 2 | 99.58 | 0.42 | |
| IE and Jharkhand | 1 | 4 | 94.16 | 5.84** | |
| Sea nomads | 1 | 3 | 87.19 | 12.81* | |
| Semang | 1 | 3 | 70.07 | 29.93** | |
| SouthernThai | 1 | 5 | 89.97 | 10.03** | |
| Thailand | 1 | 11 | 92.27 | 7.73** | |
| Sea nomads vs. AA | 2 | 9 | 76.72** | 18.47** | 4.81 |
| Sea nomads vs. AA-no Semang | 2 | 6 | 78.96** | 15.94** | 5.10 |
| Sea nomads vs. Semang | 2 | 6 | 75.49** | 20.96** | 3.55 |
| Sea nomads vs. AN | 2 | 15 | 82.71** | 7.75** | 9.54** |
| Sea nomads vs. TK | 2 | 8 | 88.84** | 4.14** | 7.02* |
| Sea nomads vs. ST | 2 | 5 | 90.59** | 2.37* | 7.04 |
| Sea nomads vs. SA | 2 | 7 | 78.49** | 5.23** | 16.28* |
| Sea nomads vs. Filipino | 2 | 5 | 87.68** | 3.37* | 8.95 |
| Sea nomads vs. Taiwanese | 2 | 6 | 67.93** | 9.55** | 22.52 |
| Sea nomads vs. Malays | 2 | 9 | 89.57** | 4.75** | 5.68* |
| Sea nomads vs. Southern Thai | 2 | 5 | 88.28** | 6.60** | 5.12 |

* indicates P < 0.05

** indicates P < 0.01

 

 

RESULTS AND DISCUSSION

Genetic diversity of southern Thai and total Thai populations

We newly genotyped 74 males from four southern Thai populations (SouthernThai1, SouthernThai2, Urak Lawoi, and Moklen). For comparison, we also genotyped 55 males from northeastern Thailand (Khmer and Lao Isan) and 40 males from western Thailand (MonW) (Supplementary Table S1). The 74 southern Thai genotypes were combined with 11 previously published Moken Y-STR genotypes (Seetaraso et al., 2020) to create a comprehensive southern Thai dataset of 85 individuals. The allele frequencies and heterozygosity for this dataset are listed in Supplementary Table S2. In this southern Thai dataset (n = 85), we identified 75 distinct haplotypes. A total of 100 distinct alleles were observed, with frequencies ranging from 0.0118 to 0.8235 (Supplementary Table S2). Gene diversity (GD) per locus ranged from 0.3014 (DYS391) to 0.8941 (DYS385b), with an average GD of 0.693 ± 0.134. With the exception of DYS391, all loci exhibited GD values greater than 0.5. Haplotype frequencies varied from 0.012 to 0.035, and the overall haplotype match probability (HMP) was 0.015. Of the 75 haplotypes, 67 (89.33%) were each observed in a single individual. The remaining eight comprised seven haplotypes found in multiple individuals within a single population (3 in SouthernThai1, 3 in Moklen, and 1 in Moken) and one haplotype shared between the Moklen and Urak Lawoi. The overall haplotype diversity (HD) for the southern Thai populations was 0.997 ± 0.003, and the discrimination capacity (DC) was 0.882.
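As a quick consistency check, the summary values for the southern Thai dataset can be recomputed from the counts given above. The relations below assume DC = M/N, HMP = Σp_i², and HD = N/(N-1) × (1 - HMP); the maximum haplotype frequency additionally assumes that the most common haplotype was carried by three individuals, which is inferred from the reported range rather than stated in the text.

```latex
\begin{align*}
\mathrm{DC} &= \frac{M}{N} = \frac{75}{85} \approx 0.882, \\
\text{singleton share} &= \frac{67}{75} \approx 0.8933, \\
\mathrm{HD} &= \frac{N}{N-1}\,(1-\mathrm{HMP}) = \frac{85}{84}\,(1-0.015) \approx 0.997, \\
p_{\min} &= \frac{1}{85} \approx 0.012, \qquad p_{\max} = \frac{3}{85} \approx 0.035 .
\end{align*}
```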

 

We then combined the southern Thai dataset with the 95 newly generated genotypes from northeastern and western Thailand and the remaining 265 previously published genotypes (276 published genotypes in total, including the 11 Moken), bringing the total Thai dataset to 445 samples (Table 1). In this total dataset (Supplementary Table S3), 409 different haplotypes were identified, of which 379 (92.67%) were singletons. We also found 25 haplotypes shared within a single population: 3 in SouthernThai1, 3 in Moklen, 1 in Moken, 2 in Khmer, 4 in MonNE, 1 in MonW, 1 in Lao Isan, 9 in Yong, and 1 in the general Thai group. Five haplotypes were shared between different populations: one between Urak Lawoi and Moklen, one between Urak Lawoi and MonNE, one between Lao Isan and Thai, and two between Yong and Thai. For this combined dataset, haplotype frequencies ranged from 0.002 to 0.007, with a haplotype match probability (HMP) of 0.003. The overall haplotype diversity (HD) was 0.9996 ± 0.0002, and the discrimination capacity (DC) was 0.852. As detailed in Supplementary Table S3, a total of 120 alleles were observed, with frequencies ranging from 0.0023 to 0.6944. Gene diversity (GD) per locus varied from 0.4473 (DYS437) to 0.8761 (DYS385b), with an average of 0.672 ± 0.122. All loci except DYS437 (0.4473) and DYS391 (0.4695) exhibited GD values greater than 0.5.
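The per-locus allele frequencies and gene diversity values summarized above follow directly from the genotype table. The sketch below assumes the usual unbiased form GD = N/(N-1) × (1 - Σa_i²); the function names and the four two-locus toy genotypes are made up for illustration and are not data from Supplementary Tables S2 or S3.

```python
from collections import Counter


def locus_gene_diversity(alleles):
    """Return (allele frequencies, gene diversity) for one locus."""
    n = len(alleles)
    freqs = {allele: count / n for allele, count in Counter(alleles).items()}
    gd = n / (n - 1) * (1 - sum(f * f for f in freqs.values()))
    return freqs, gd


def per_locus_diversity(haplotypes, locus_names):
    """Apply locus_gene_diversity to every column of a haplotype table."""
    result = {}
    for idx, name in enumerate(locus_names):
        result[name] = locus_gene_diversity([h[idx] for h in haplotypes])
    return result


# Made-up genotypes for two loci (repeat counts), one tuple per individual.
toy = [(14, 10), (14, 10), (15, 10), (16, 11)]
table = per_locus_diversity(toy, ["DYS19", "DYS391"])
average_gd = sum(gd for _, gd in table.values()) / len(table)
```

In the toy data the locus with one dominant allele has the lower gene diversity, the same pattern that places DYS391 and DYS437 below 0.5 in the Thai dataset.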

 

When focusing on individual populations within Thailand, the Moklen exhibited the lowest haplotype diversity (HD = 0.936 ± 0.051) and discrimination capacity (DC = 0.692). Meanwhile, the Khmer from northeastern Thailand showed the lowest mean pairwise difference (MPD = 8.478 ± 4.094) and gene diversity (GD = 0.499 ± 0.269). The Moklen also showed the second-lowest MPD and GD values (9.949 ± 4.869 and 0.585 ± 0.322, respectively) among the Thai populations. In contrast, the Moken and Urak Lawoi showed higher genetic diversity than the Moklen. Comparative analysis of genetic diversity revealed that the Moklen and Khmer from Thailand, as well as the Amis, Tao, and Indigenous Taiwanese populations, exhibited lower within-population genetic diversity than other populations such as the Thai, Han, and Malays (Table 1). The reduced diversity in Moklen and Khmer populations, combined with a high proportion of shared internal haplotypes and large genetic distances from other groups (Figures 1 and 2), likely reflects strong population-specific relatedness, small effective population sizes, and the effects of genetic drift. Consequently, the low discrimination capacity of these Y-STR markers in the Moklen (DC = 0.692) indicates a high risk of false or adventitious matches when these individuals are evaluated against a pooled Thai database. Therefore, forensic analyses should prioritize the use of population-specific allele frequencies or employ higher-resolution Y-STR panels to reduce the risk of incorrectly matching unrelated males from these endogamous populations. This scenario of reduced diversity is analogous to patterns observed in other isolated, patrilineal groups, such as the Austronesian-speaking Tao and Aboriginal Taiwanese, who have also experienced significant genetic drift (Hwa et al., 2010; Trejaut et al., 2014).

 

 

Figure 1. Heat plot of pairwise genetic distances (Rst) based on Y-STR haplotypes between 32 populations. The "=" symbol indicates Rst values that are not significantly different from zero (P > 0.05).

 

Figure 2. The three-dimensional MDS plots of the 32 populations for dimension 1 vs. dimension 2 (A), dimension 1 vs. dimension 3 (B), and dimension 2 vs. dimension 3 (C), and the heat plot of standardized MDS values for the three dimensions (D).

 

Genetic variation of Thai and other populations

To quantify the structuring of paternal genetic variation, we performed an AMOVA on the dataset of 32 populations from EA, MSEA, ISEA, and SA. The results indicate that variation among populations accounts for 13.86% of the total genetic variance (Table 2). Among geographic regions, the EA group shows the largest among-population variation (13.78%, P < 0.01), followed by MSEA (9.41%, P < 0.01), while the two Filipino groups representing ISEA are genetically similar (-0.33%, P > 0.05). Among language families, the AA group shows the greatest genetic heterogeneity among populations (21.17%, P < 0.01), followed by the AN (9.86%, P < 0.01) and TK groups (3.41%, P < 0.01); the Sino-Tibetan (ST) group shows the lowest among-population variation (0.42%, P > 0.05). The SA groups showed moderate among-population variation (5.84%, P < 0.01). The greatest heterogeneity among populations is exhibited by the Semang from Malaysia (29.93%, P < 0.01), which is more than three times that of the Thailand group (7.73%, P < 0.01), while the sea nomads show slightly higher variation (12.81%, P < 0.05) than the Southern Thai group (10.03%, P < 0.01).

 

A direct comparison of the sea nomads with other groups showed smaller among-group differences for sea nomads vs. Semang (3.55%, P > 0.05), sea nomads vs. AA (4.81%, P > 0.05), sea nomads vs. AA-no Semang (5.10%, P > 0.05), and sea nomads vs. southern Thai groups (5.12%, P > 0.05) than for the other comparisons, i.e., SA, AN, TK, ST, Filipino, Taiwanese, and Malays (Table 2). This suggests a closer genetic relationship between the sea nomads and populations from southern Thailand and AA-speaking groups from MSEA. Overall, our results indicate that the Thai sea nomad populations exhibit more genetic connections to MSEA populations, in contrast to previous mtDNA results (Kutanan et al., 2025).

 

Genetic relationships between the Thai sea nomads and other populations

The AMOVA, which grouped all sea nomad populations together, revealed significant among-population variation (12.81%, P < 0.05), reflecting genetic heterogeneity within these groups. This finding is consistent with the genetic distance results. Pairwise Rst values indicated a non-significant genetic difference between the Moklen and Moken. Furthermore, both the Moklen and Moken showed a close affinity to the MalayChampa (Figure 1). In contrast, the Urak Lawoi showed closer genetic relatedness to several other populations (including southern Thais, Mon from western Thailand, various AN-speaking groups from Malaysia and the Philippines, and SA populations) than they did to the Moken and Moklen. Overall, while the Moken and Moklen are genetically close to each other, they are distinct from most other populations. The Urak Lawoi, conversely, are genetically distinct from the Moken and Moklen and show more extensive connections to neighboring mainland and island groups.

 

To visualize inter-population relationships, an MDS analysis based on pairwise Rst values was performed. In dimension 1 of the MDS plot, AA-speaking populations from Northeastern Thailand (Mon and Khmer) and Malaysia (Semai) formed a cluster on the right side, with the Khmer positioned most distantly from other groups. Conversely, the Urak Lawoi and southern Thai groups clustered on the left side together with populations from Malaysia, the Philippines, and Taiwan, indicating genetic affinity among AN-speaking populations (Figure 2A). The major Thai populations were intermediately distributed between these two sides. Dimension 2 separated the SA groups, which clustered in the lower portion of the plot. Notably, the Mon, Khmer, southern Thais, Urak Lawoi, and some Malay groups shifted towards the SA cluster, reflecting a degree of South Asian genetic influence in these populations (Figure 2B). Dimension 3 highlighted the genetic distinctiveness of several groups. The Moken and Moklen formed a unique cluster in the lower part of the plot, separate from all other populations. Additionally, the AN-speaking Tao and Indigenous Taiwanese, along with the AA-speaking Semai and Khmer, were positioned at the margins of the plot, underscoring their genetic divergence (Figure 2C). Finally, a heatmap of the MDS coordinates confirmed the genetic divergence between the Urak Lawoi and the Moken-Moklen cluster (Figure 2D).

 

Overall, our findings demonstrate genetic heterogeneity among the sea nomad populations. The Moken and Moklen show greater genetic affinity to non-AN-speaking groups, while the Urak Lawoi are genetically closer to AN-speaking groups. Our Y-STR results align closely with the linguistic evidence, which separates Moken/Moklen from Urak Lawoi, but they contrast with some cultural markers (such as lifestyle and boat type) that group the Moklen and Urak Lawoi together.

 

Previous autosomal STR studies reported the genetic differentiation of the Urak Lawoi from the Moken and Moklen, as well as evidence of mixed ancestry in the Moklen (Srikummool et al., 2022). In contrast, mtDNA analyses indicated the maternal distinctiveness of the Moken, while the Moklen and Urak Lawoi showed closer maternal relatedness and shared evidence of contact with ISEA groups, such as the prevalence of haplogroup E1a1a1 (Kutanan et al., 2025). Our study thus reveals contrasting paternal and maternal genetic ancestries for the Moklen. This genetic complexity is mirrored by their evolving cultural identity. The mixed ancestry of the Moklen revealed by autosomal STRs, combined with their sharing of mtDNA haplogroup E1a1a1 with the Urak Lawoi and other ISEA groups, points to admixture between the Moklen and other populations in the female lineage but not in the paternal lineage. Our results reflect a scenario where Moklen males share a common ancestry with the Moken, likely originating from MSEA, whereas their female lineages derive from more diverse sources due to population contact. Interestingly, our Y-STR data also show that the Moken and Moklen have less SA ancestry than the Urak Lawoi. This contrasts with mtDNA results (Kutanan et al., 2025) and may be a consequence of genetic drift or the small male sample sizes in these groups.

 

Although the main focus of this study is on the sea nomad groups, our analysis also yielded interesting results for the two geographically distant Southern Thai populations. The Southern Thai1 group is from the Tak Bai district of Narathiwat Province (lower southern Thailand), while the Southern Thai2 group is from Phuket and Nakhon Si Thammarat Provinces (situated further north). Despite their geographic separation, both groups showed a similar paternal genetic structure. They exhibit genetic affinities to AA-speaking Mon from western Thailand, AN-speaking groups from Malaysia, and SA groups, reflecting a highly admixed population structure. The SA genetic influence has been previously reported in Khmer, Mon, Central Thai, and other Southern Thai populations (Kutanan et al., 2021, 2025; Woravatin et al., 2023). Our study expands on these findings, providing a clearer picture of SA-related ancestry in an additional Southern Thai population from Tak Bai, Narathiwat, confirming that this influence is geographically widespread across the region. Interestingly, we also observed genetic divergence of the Northeastern Thai Khmer from other populations, in agreement with previous mtDNA and autosomal STR studies (Boonsoda et al., 2013; Chantakot et al., 2017). Their lower MPD (8.478 ± 4.094) and gene diversity (GD = 0.499 ± 0.269) compared with other Thai populations (Table 1) reflect strong within-population relatedness, a small effective population size, and the effects of genetic drift.
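
For readers less familiar with these summary statistics, the sketch below computes the mean number of pairwise differences (MPD) and an average gene diversity over loci for a handful of invented haplotypes. It assumes, as an interpretation rather than a statement of the method used here, that MPD counts the loci at which two haplotypes differ and that GD is the per-locus average of Nei's unbiased gene diversity. Under that assumption GD equals MPD divided by the number of loci, which is consistent with the values quoted above for 17 loci (8.478 / 17 ≈ 0.499). All haplotypes and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy haplotypes (repeat counts at three hypothetical loci); not study data.
haplotypes = [
    (14, 13, 29),
    (14, 13, 30),
    (15, 13, 29),
    (14, 12, 29),
]
n_loci = len(haplotypes[0])


def n_differences(h1, h2):
    """Number of loci at which two haplotypes carry different alleles."""
    return sum(a != b for a, b in zip(h1, h2))


def unbiased_gene_diversity(alleles):
    """Nei's unbiased gene diversity for one locus: n/(n-1) * (1 - sum p_i^2)."""
    total = len(alleles)
    counts = Counter(alleles)
    return total / (total - 1) * (1 - sum((c / total) ** 2 for c in counts.values()))


# MPD: average number of differing loci over all pairs of haplotypes.
pairs = list(combinations(haplotypes, 2))
mpd = sum(n_differences(a, b) for a, b in pairs) / len(pairs)

# Average gene diversity over loci, computed two equivalent ways.
gd_from_mpd = mpd / n_loci
gd_per_locus = [unbiased_gene_diversity([h[i] for h in haplotypes])
                for i in range(n_loci)]
gd_mean = sum(gd_per_locus) / n_loci

print("MPD:", mpd)
print("GD from MPD:", gd_from_mpd)
print("GD averaged over loci:", gd_mean)

# Consistency check on the reported values, assuming 17 loci.
reported_ratio = round(8.478 / 17, 3)
print("8.478 / 17 rounded:", reported_ratio)
```

Low values of both statistics mean that two randomly drawn males tend to carry similar haplotypes, which is why they are read as signs of relatedness and drift in small populations.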

 

CONCLUSION

In this study, we genotyped and analyzed forensic Y-STRs in several key Thai populations, including AN-speaking sea nomads, TK-speaking Southern Thais, AA-speaking Mon and Khmer, and TK-speaking Lao Isan groups. We generated Y-STR allele frequency data for the Southern Thai and a total Thai dataset, which will serve as a valuable preliminary reference database for future forensic investigations. From an anthropological perspective, our findings reveal a close genetic similarity between the Moken and Moklen. Both are genetically distinct from the Urak Lawoi, who show a closer relationship to the majority of Southern Thai populations. The contrasting genetic variations observed in the Moklen suggest different genetic histories of male and female lineages, and we propose that Moklen males descend from a common Moken-Moklen ancestral population originating in MSEA. We acknowledge several limitations in this study, primarily the small sample sizes for the sea nomad populations and the limited sampling across the full extent of their communities in Southern Thailand. Additionally, direct comparison between Y-STR data and mtDNA sequences may introduce some bias due to the different mutation rates and genetic variation of these marker types. Future studies that combine whole Y-chromosome sequencing, which can define precise haplogroups, with genome-wide data and a more comprehensive sampling of sea nomad communities will be essential for a complete reconstruction of the complex genetic history of this region. Similarly, increasing the number of samples and Y-STR loci using different panels can establish a more robust database for future forensic applications.

 

ACKNOWLEDGEMENTS

We thank the coordinators who assisted with sample collection, and we thank all participants who donated their biological samples.

 

AUTHOR CONTRIBUTIONS

Pongsapak Jitsuwan: Formal Analysis (Equal), Writing Original Draft (Lead), Writing Review & Editing (Equal), Investigation (Equal); Metawee Srikummool: Methodology (Supporting), Writing Review & Editing (Equal), Investigation (Supporting); Jatupol Kampuansai: Methodology (Supporting), Writing Review & Editing (Equal), Investigation (Supporting); Kanha Muisuk: Methodology (Equal); Nonglak Prakhun: Methodology (Equal), Formal Analysis (Supporting); Wibhu Kutanan: Conceptualization (Lead), Methodology (Equal), Formal Analysis (Equal), Writing Original Draft (Lead), Writing Review & Editing (Lead), Investigation (Equal), Supervision (Lead), Project Administration (Lead).

 

CONFLICT OF INTEREST

The authors have no conflicts of interest to declare that are relevant to the content of this article.

 

REFERENCES

Adnan, A., Rakha, A., Noor, A., van Oven, M., Ralf, A., and Kayser, M. 2018. Population data of 17 Y-STRs (Yfiler) from Punjabis and Kashmiris of Pakistan. International Journal of Legal Medicine. 132(1): 137-138. https://doi.org/10.1007/s00414-017-1611-9

 

Aslam, M.A., Hussain, M., Khan, K., Zahra, F.T., Shafique, M., and Javeed, S. 2020. Haplotype diversity of 17 Y-STRs in Sheikh population of Punjab. International Journal of Legal Medicine. 134(4): 1325-1326. https://doi.org/10.1007/s00414-019-02202-1

 

Attavanich, M. 2016. A study of living conditions in post-tsunami houses: The case of the Moklen ethnic minority in Phang Nga Province, Southern Thailand. Ph.D. Dissertation. Kyoto University, Kyoto, Japan.

 

Bellina, B., Blench, R., and Galipaud, J.-C. 2021. Sea Nomadism from the past to the present. p. 1–27. In B. Bellina, R. Blench and J.-C. Galipaud (Eds.), Sea nomads of Southeast Asia: From the past to the present. NUS Press, Singapore. https://doi.org/10.2307/j.ctv2gjx12g.5

 

Bertorelle, G., Calafell, F., Francalacci, P., Bertranpetit, J., and Barbujani, G. 1996. Geographic homogeneity and non-equilibrium patterns of mtDNA sequences in Tuscany, Italy. Human Genetics. 98(2): 145-150. https://doi.org/10.1007/s004390050178

 

Boonsoda, P., Srithawong, S., Srikuka, S., and Kutanan, W. 2013. Mitochondrial DNA variation of the Khmer in Surin Province, Thailand. Thai Journal of Genetics. 6: 40-48 (in Thai).

 

Chantakot, P., Pittayaporn, P., Srithongdaeng, K., Srithawong, S., and Kutanan, W. 2017. Genetic divergence of Austroasiatic speaking groups in the Northeast of Thailand: A case study on northern Khmer and Kuy. Chiang Mai Journal of Science. 44(4): 1279-1294.

 

Dancause, K.N., Chan, C.W., Arunotai, N.H., and Lum, J.K. 2009. Origins of the Moken Sea Gypsies inferred from mitochondrial hypervariable region and whole genome sequences. Journal of Human Genetics. 54(2): 86-93. https://doi.org/10.1038/jhg.2008.12

 

Eberhard, D.M., Simons, G.F., and Fennig, C.D. 2020. Ethnologue: Languages of the world (23rd ed.). SIL International, Texas.

 

Excoffier, L. and Lischer, H.E.L. 2010. Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources. 10(3): 564-567.

 

Excoffier, L., Smouse, P.E., and Quattro, J.M. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics. 131(2): 479-491. https://doi.org/10.1093/genetics/131.2.479

 

Hoogervorst, T.G. 2012. Ethnicity and aquatic lifestyles: Exploring Southeast Asia's past and present seascapes. Water History. 4(3): 245-265. https://doi.org/10.1007/s12685-012-0060-0

 

Hope, S. 2000. Outcasts of the islands: The sea gypsies of South East Asia. Harper Collins Publishers, New York.

 

Huang, D., Shi, S., Zhu, C., Yi, S., Ma, W., Wang, H., and Li, H. 2011. Y-haplotype screening of local patrilineages followed by autosomal STR typing can detect likely perpetrators in some populations. Journal of Forensic Sciences. 56(5): 1340-1342. https://doi.org/10.1111/j.1556-4029.2011.01830.x

 

Hwa, H.L., Tseng, L.H., Ko, T.M., Chang, Y.Y., Yin, H.Y., Su, Y.N., and Lee, J.C. 2010. Seventeen Y-chromosomal short tandem repeat haplotypes in seven groups of population living in Taiwan. International Journal of Legal Medicine. 124(4): 295-300. https://doi.org/10.1007/s00414-010-0425-9

 

Imam, J., Reyaz, R., Singh, R.S., Bapuly, A.K., and Shrivastava, P. 2018. Genomic portrait of population of Jharkhand, India, drawn with 15 autosomal STRs and 17 Y-STRs. International Journal of Legal Medicine. 132(1): 139-140. https://doi.org/10.1007/s00414-017-1610-x

 

Kampuansai, J., Kutanan, W., Dudás, E., Vágó-Zalán, A., Galambos, A., and Pamjav, H. 2020. Paternal genetic history of the Yong population in Northern Thailand revealed by Y-chromosomal haplotypes and haplogroups. Molecular Genetics and Genomics. 295(3): 579-589. https://doi.org/10.1007/s00438-019-01644-x

 

Kutanan, W., Kitpipit, T., Phetpeng, S., and Thanakiatkrai, P. 2014. Forensic STR loci reveal common genetic ancestry of the Thai-Malay Muslims and Thai Buddhists in the deep Southern region of Thailand. Journal of Human Genetics. 59(12): 675-681. https://doi.org/10.1038/jhg.2014.93

 

Kutanan, W., Liu, D., Kampuansai, J., Srikummool, M., Srithawong, S., Shoocongdej, R., Sangkhano, S., Ruangchai, S., Pittayaporn, P., Arias, L., et al. 2021. Reconstructing the human genetic history of mainland Southeast Asia: Insights from genome-wide data from Thailand and Laos. Molecular Biology and Evolution. 38(8): 3459-3477. https://doi.org/10.1093/molbev/msab124

 

Kutanan, W., Woravatin, W., Srikummool, M., Suwannapoom, C., Hübner, A., Kampuansai, J., Khaokiew, C., Schaschl, H., Översti, S., La, D.D., et al. 2025. Maternal genetic origin of Chao Lay coastal maritime populations from Thailand. BMC Biology. 23: 146. https://doi.org/10.1186/s12915-025-02252-5

 

Larish, M.D. 2000. The position of Moken and Moklen within the Austronesian language family (Thailand). University Microfilms.

 

Larish, M.D. 2005. Moken and Moklen. p. 513–533. In S. Adelaar and N. Himmelmann (Eds.), The Austronesian languages of Asia and Madagascar. Routledge, London.

 

Li, L., Xu, Y., Luis, J.R., Alfonso-Sanchez, M.A., Zeng, Z., Garcia-Bertrand, R., and Herrera, R.J. 2019. Cebú, Thailand and Taiwanese aboriginal populations according to Y-STR loci. Gene: X. 1: 100001.

 

Nawichai, P. 2008. Ethnic group livelihood strategies and state integration: Moken and the hill people in negotiation with the state. Ph.D. Dissertation. University of Passau, Passau, Germany.

 

Rahayu, S.K.S., Aedrianee, A.R., Haslindawaty, A.R., Azeelah, A.N., Panneerchelvam, S., Norazmi, M.N., and Zafarina, Z. 2018. Paternal lineage affinity of the Malay subethnic and Orang Asli populations in Peninsular Malaysia. International Journal of Legal Medicine. 132(4): 1087-1090. https://doi.org/10.1007/s00414-017-1697-0

 

Robert, C.S. 2010. Changes in patterns of communication of the Urak Lawoi’. Ph.D. Dissertation. University of Wisconsin, Wisconsin, USA.

 

Sahoo, S., Samal, R., Behera, S., Biswas, S., Dixit, S., Kumawat, R.K., Chaubey, G., Bhasney, V., and Shrivastava, P. 2021. Genomic insight into Y-STR diversity in the population of Odisha, India. International Journal of Legal Medicine. 135(5): 1771-1772. https://doi.org/10.1007/s00414-021-02545-8

 

Seetaraso, T., Kutanan, W., Kampuansai, J., Muisuk, K., and Srikummool, M. 2020. Unique genetic structure of Y-chromosomal lineage of the Moken from the Andaman Sea of Thailand. Chiang Mai University Journal of Natural Sciences. 19(4): 1066-1079.

 

Srikummool, M., Srithawong, S., Muisuk, K., Sangkhano, S., Suwannapoom, C., Kampuansai, J., and Kutanan, W. 2022. Forensic and genetic characterizations of diverse southern Thai populations based on 15 autosomal STRs. Scientific Reports. 12(1): 655. https://doi.org/10.1038/s41598-021-04646-1

 

Steinhauer, H. 2008. On the development of Urak Lawoi' Malay. Wacana, Journal of the Humanities of Indonesia. 10(1): 117-143. https://doi.org/10.17510/wjhi.v10i1.181

 

Suttijit, S., Muisuk, K., Srithawong, S., Kampuansai, J., Srikummool, M., and Kutanan, W. 2019. Different genetic structure between male and female of the Mon from Nakhon Ratchasima Province. p. 1–10. In C. Boonpakdee, W. Sukparangsri, W. Pongprayoon, and A. Udomprasert (Eds.), Proceeding of 21st National Genetic Conference. Chonburi, Thailand, 20-22 June 2019. The Zign Hotel.

 

Thongtaweewiwat, K. 2006. Community rights and interest of Thai society: A case study of Moklen land ownership after the Tsunami in the Baan Tungwah, Khukkak Sub-district, Takuapa District, Phang Nga. Master's Dissertation. Chulalongkorn University, Thailand.

 

Tian-Xiao, Z., Li, Y., and Sheng-Bin, L. 2009. Y-STR haplotypes and the genetic structure from eight Chinese ethnic populations. Legal Medicine. 11: S198-S200.

 

Trejaut, J.A., Poloni, E.S., Yen, J.C., Lai, Y.H., Loo, J.H., Lee, C.L., He, C.L., and Lin, M. 2014. Taiwan Y-chromosomal DNA variation and its relationship with Island Southeast Asia. BMC Genetics. 15: 77. https://doi.org/10.1186/1471-2156-15-77

 

Woravatin, W., Stoneking, M., Srikummool, M., Kampuansai, J., Arias, L., and Kutanan, W. 2023. South Asian maternal and paternal lineages in southern Thailand and the role of sex-biased admixture. PLoS One. 18(9): e0291547. https://doi.org/10.1371/journal.pone.0291547

 

OPEN access freely available online

Natural and Life Sciences Communications

Chiang Mai University, Thailand. https://cmuj.cmu.ac.th

 

Supplementary

Table S1. Raw Y-STRs genotyping of newly generated samples.

 

Table S2. Allelic frequency and gene diversity value of southern Thai dataset.

 

Table S3. Allelic frequency and gene diversity value of total Thai dataset.

 

Table S4. Number of shared haplotypes between populations (below the diagonal) and matching probability (above the diagonal).

 

Table S5. The Rst (below the diagonal) and P values (above the diagonal) among the entire dataset of 32 populations.

 

Pongsapak Jitsuwan1, Metawee Srikummool2, Jatupol Kampuansai3, Kanha Muisuk4, Nonglak Prakhun1, and Wibhu Kutanan1, *

 

1 Department of Biology, Faculty of Science, Naresuan University, Phitsanulok 65000, Thailand.

2 Department of Biochemistry, Faculty of Medical Science, Naresuan University, Phitsanulok 65000, Thailand.

3 Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand.

4 Faculty of Science, Naresuan University, Phitsanulok 65000, Thailand.

 

Corresponding author: Wibhu Kutanan, E-mail: wibhuk@nu.ac.th

 

ORCID iD:

Pongsapak Jitsuwan: https://orcid.org/0009-0009-7112-6405

Metawee Srikummool:  https://orcid.org/0000-0002-3457-2723

Jatupol Kampuansai: https://orcid.org/0000-0003-4687-104X

Kanha Muisuk: https://orcid.org/0009-0006-8043-9321

Nonglak Prakhun: https://orcid.org/0009-0000-5631-4736

Wibhu Kutanan: https://orcid.org/0000-0001-7767-1644

 




Editor: Sirasit Srinuanpan, Chiang Mai University, Thailand

 

Article history:

Received: October 24, 2025;

Revised:  February 1, 2026;

Accepted: February 13, 2026;

Online First: March 23, 2026