Age-Standardized Hospitalization Trends and Spatial Clustering of Head and Neck Malignant Neoplasms (C00–C14, C30–C32) in Chile, 2010–2024

Tendencias de Hospitalización Estandarizadas por Edad y Agrupamiento Espacial de Neoplasias Malignas de Cabeza y Cuello (C00–C14, C30–C32) en Chile, 2010–2024

Author

Amaru Simón Agüero Jiménez

Published

April 6, 2026

Resumen

ANÁLISIS EPIDEMIOLÓGICO DE HOSPITALIZACIONES POR NEOPLASIAS MALIGNAS DE CABEZA Y CUELLO EN CHILE, 2010–2024

Introducción: Las neoplasias malignas de cabeza y cuello (CIE-10 C00–C14, C30–C32) constituyen un grupo heterogéneo de tumores con carga epidemiológica relevante a nivel mundial, fuertemente vinculados al consumo de tabaco y alcohol, infección por virus del papiloma humano y exposiciones ocupacionales. En Chile la información poblacional y la distribución geográfica de estas hospitalizaciones han sido escasamente integradas en un análisis nacional.

Materiales y Métodos: Estudio descriptivo ecológico basado en los Egresos Hospitalarios del Departamento de Estadísticas e Información en Salud (DEIS) del Ministerio de Salud de Chile, período 2010–2024. Se incluyeron los egresos con diagnóstico principal C00–C14 y C30–C32 clasificados en 18 categorías anatómicas y agrupados en subsitios clínicos (cavidad oral, glándulas salivales, orofaringe, nasofaringe, hipofaringe, faringe otra, sinonasal y laringe). Se calcularon tasas estandarizadas por edad (TEE) por sexo utilizando la población estándar mundial de la Organización Mundial de la Salud y proyecciones poblacionales del INE. El análisis geoespacial comunal empleó el índice global de Moran y los indicadores locales de asociación espacial (LISA) con matriz de contigüidad Queen sobre Chile continental.

Resultados: Se analizaron 24.575 egresos hospitalarios en el período (16.472 hombres y 8.103 mujeres; razón M:F 2,03; promedio anual de 1.638), siendo el sitio más frecuente la laringe (C32), seguido de cavidad oral. La TEE total mostró una tendencia descendente moderada con marcado predominio masculino, alcanzando en 2024 valores aproximados de 10,6 por 100.000 en hombres y 4,9 en mujeres (tasas crudas de 13,9 y 6,7 por 100.000, respectivamente). El análisis espacial detectó autocorrelación positiva significativa para el total de neoplasias (I de Moran = 0,070; p = 0,049) y mayor para laringe (I = 0,092; p = 0,010), con 22 comunas clasificadas como conglomerados High–High (hot spots) y 24 como Low–Low (cold spots).

Conclusiones: Las hospitalizaciones por cáncer de cabeza y cuello en Chile muestran un predominio masculino sostenido, concentración en laringe y cavidad oral, y conglomerados espaciales significativos que sugieren focos territoriales con implicaciones para la planificación preventiva y diagnóstica.

Palabras clave: neoplasias de cabeza y cuello; cáncer laríngeo; tasas estandarizadas por edad; análisis espacial; Chile.

Abstract

EPIDEMIOLOGICAL ANALYSIS OF HOSPITALIZATIONS FOR MALIGNANT NEOPLASMS OF THE HEAD AND NECK IN CHILE, 2010–2024

Introduction: Malignant neoplasms of the head and neck (ICD-10 C00–C14, C30–C32) represent a heterogeneous group of tumors with substantial epidemiological burden worldwide, strongly linked to tobacco and alcohol consumption, human papillomavirus infection, and occupational exposures. In Chile, population-level data and the geographic distribution of these hospitalizations have rarely been integrated within a unified national analysis.

Materials and Methods: Descriptive ecological study based on the Hospital Discharge Database (EGRESOS) maintained by the Department of Health Statistics and Information (DEIS), Chilean Ministry of Health, covering 2010–2024. Discharges with primary diagnosis C00–C14 and C30–C32 were classified into 18 anatomical categories grouped into clinical subsites (oral cavity, salivary glands, oropharynx, nasopharynx, hypopharynx, other pharynx, sinonasal, and larynx). Sex-specific age-standardized rates (ASR) were computed using the WHO World Standard Population and Chilean population projections from INE. Municipal-level geospatial analysis employed global Moran’s I and Local Indicators of Spatial Association (LISA) with Queen contiguity weights across continental Chile.

Results: A total of 24,575 hospital discharges were analyzed (16,472 men and 8,103 women; M:F ratio 2.03; mean of 1,638 per year), with larynx (C32) as the most frequent site followed by oral cavity. Total ASR showed a moderate downward trend with sustained male predominance, reaching in 2024 approximately 10.6 per 100,000 in males and 4.9 in females (crude rates of 13.9 and 6.7 per 100,000, respectively). Spatial analysis revealed significant positive autocorrelation for total head and neck neoplasms (Moran’s I = 0.070; p = 0.049) and stronger clustering for larynx (I = 0.092; p = 0.010), with 22 municipalities classified as High–High clusters (hot spots) and 24 as Low–Low (cold spots).

Conclusions: Hospitalizations for head and neck cancer in Chile show sustained male predominance, concentration in larynx and oral cavity, and significant spatial clusters suggesting territorial foci with implications for preventive and diagnostic planning.

Keywords: head and neck neoplasms; laryngeal cancer; age-standardized rates; spatial analysis; Chile.

1 Introduction

Malignant neoplasms of the head and neck (HNC) represent a heterogeneous group of cancers arising from the oral cavity (C00-C06), salivary glands (C07-C08), oropharynx (C09-C10), nasopharynx (C11), hypopharynx (C12-C13), other parts of lip, oral cavity, and pharynx (C14), nasal cavity and middle ear (C30), accessory sinuses (C31), and larynx (C32). Collectively, HNC malignancies constitute a major cause of cancer morbidity and mortality globally, with substantial variation in incidence and etiology across geographic regions.

Globally, head and neck cancers account for approximately 1.1 million new cases and over 600,000 deaths annually, representing roughly 3% of all cancer incidence worldwide. The epidemiology of HNC is characterized by strong dose-response relationships with tobacco and alcohol consumption, which together account for the majority of cases in high-income countries. In contrast, human papillomavirus (HPV) infection, predominantly HPV-16, drives an increasing proportion of oropharyngeal cancers in developed nations. Nasopharyngeal cancer, while rare in most populations, exhibits exceptionally high incidence in Southeast Asia and North Africa, driven by endemic Epstein-Barr virus (EBV) infection and genetic predisposition. Laryngeal cancer remains one of the most common HNC sites globally and, beyond tobacco and alcohol, is also associated with occupational exposures in certain industries.

In Chile, head and neck cancers represent an important public health concern. The country’s burden of HNC is shaped by the high prevalence of tobacco smoking and alcohol consumption, particularly in lower socioeconomic groups. Limited data on HPV-related cancers suggest emerging trends in oropharyngeal malignancies among younger populations. Geographic variation in HNC incidence across Chilean regions may reflect differences in exposure patterns, healthcare access, and screening practices. These patterns underscore the need for a comprehensive, nationally representative analysis that integrates all C00–C14 and C30–C32 codes within a unified epidemiological framework.

Hospital discharge records (EGRESOS) from the Department of Health Statistics and Information (DEIS) of the Chilean Ministry of Health provide a robust administrative data source capturing the large majority of inpatient episodes in the country, with full coverage of the public sector (FONASA) and partial coverage of the private sector (ISAPRE). Although these records reflect hospitalizations rather than incident cancer diagnoses, they constitute a valid indicator of disease burden and healthcare utilization at the population level. The continuous availability of EGRESOS data from 2001 onward, combined with census-based population projections from the National Institute of Statistics (INE), enables the computation of age-standardized hospitalization rates comparable across regions, time periods, and demographic subgroups.

This analysis covers the period 2010–2024 and has the following objectives: (1) characterize the hospitalization burden from all malignant neoplasms of the head and neck using age-standardized rates (ASR); (2) identify disparities by sex, anatomical site, and administrative region; (3) evaluate geospatial clustering patterns at the municipal (comuna) level using spatial autocorrelation statistics; and (4) provide a detailed epidemiological profile for each individual ICD-10 category within the C00–C14 and C30–C32 ranges.

2 Methods

2.1 Data Sources

Two main data sources were used. The first is the hospital discharge database (EGRESOS) maintained by the Department of Health Statistics and Information (DEIS) of the Chilean Ministry of Health, covering fiscal years 2010 through 2024. Each record contains the primary discharge diagnosis coded according to the International Classification of Diseases, 10th Revision (ICD-10), along with patient demographic information (sex, age group), geographic identifiers (region and municipality of residence), and administrative variables (year of discharge). EGRESOS captures all discharges from public hospitals in the National Health Service System (SNSS) and, since 2012, progressively includes private institutional discharges, providing near-universal coverage of inpatient episodes in Chile.

The second source is the population projection file produced by the National Institute of Statistics (INE) based on the 2017 National Census. These projections provide annual population estimates disaggregated by region, municipality (comuna), individual age, and sex for the period 2002–2035. The projections were stored in Apache Parquet format and accessed programmatically.

2.2 Case Selection

All hospital discharges whose primary diagnosis (DIAG1 field) corresponded to ICD-10 codes C00 through C14 and C30 through C32 (malignant neoplasms of head and neck organs), including all subcodes, were included in the analysis. Records with missing or invalid diagnosis codes, missing sex, or missing age group were excluded. The eighteen categories encompass: lip (C00), base of tongue (C01), other tongue (C02), gum (C03), floor of mouth (C04), palate (C05), other mouth (C06), parotid gland (C07), other salivary glands (C08), tonsil (C09), oropharynx (C10), nasopharynx (C11), piriform sinus (C12), hypopharynx (C13), other lip/oral/pharynx (C14), nasal cavity and middle ear (C30), accessory sinuses (C31), and larynx (C32).
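The selection rule can be sketched as a pattern match on the diagnosis code. This is a minimal illustration, assuming DIAG1 is stored as a string such as `C320` or `C32.0` (the storage format is an assumption); `is_head_neck_code` and `icd_category` are hypothetical helper names, not functions from the analysis code.

```python
import re

# Three-character ICD-10 categories C00-C14 and C30-C32, optionally followed
# by a subcode written with or without a dot (e.g. "C32", "C320", "C32.0").
_HNC_PATTERN = re.compile(r"^C(0\d|1[0-4]|3[0-2])(\.?[0-9A-Z]{1,2})?$")


def is_head_neck_code(code):
    """Return True when `code` falls in C00-C14 or C30-C32 (any subcode)."""
    if not isinstance(code, str):
        return False  # missing or non-string diagnoses are excluded
    return bool(_HNC_PATTERN.match(code.strip().upper()))


def icd_category(code):
    """Return the three-character category (e.g. 'C32') of a code."""
    return code.strip().upper()[:3]


sample = ["C320", "c02.1", "C15", "C33", "C140", None, "", "C30"]
selected = [c for c in sample if is_head_neck_code(c)]
print(selected)
print(sorted({icd_category(c) for c in selected}))
```

Codes outside the two ranges (here `C15` and `C33`) and missing values are dropped, mirroring the exclusion of invalid diagnosis codes described above.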

2.3 Age Standardization

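Direct standardization was applied using the WHO World Standard Population, collapsed here to 17 age groups: sixteen five-year bands from 0–4 through 75–79 plus an open-ended 80+ group. The weights are: 0–4: 8,860; 5–9: 8,690; 10–14: 8,600; 15–19: 8,470; 20–24: 8,220; 25–29: 7,930; 30–34: 7,610; 35–39: 7,150; 40–44: 6,590; 45–49: 6,040; 50–54: 5,370; 55–59: 4,550; 60–64: 3,720; 65–69: 2,960; 70–74: 2,210; 75–79: 1,520; 80+: 1,540. The listed weights sum to \(W = \sum w_i = 100{,}030\) rather than the nominal 100,000, because the published WHO percentages are rounded; the ASR formula below divides by \(\sum w_i\), so the result is unaffected.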
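As a quick check on the weights, the sketch below stores them in a dictionary and sums them. The band labels used as keys and the name `WHO_WEIGHTS` are illustrative assumptions; the analysis code may label age groups differently.

```python
# WHO World Standard weights for the 17 age groups listed above.
# Key labels are illustrative; only the numeric weights come from the text.
WHO_WEIGHTS = {
    "0-4": 8860, "5-9": 8690, "10-14": 8600, "15-19": 8470, "20-24": 8220,
    "25-29": 7930, "30-34": 7610, "35-39": 7150, "40-44": 6590,
    "45-49": 6040, "50-54": 5370, "55-59": 4550, "60-64": 3720,
    "65-69": 2960, "70-74": 2210, "75-79": 1520, "80+": 1540,
}

total_weight = sum(WHO_WEIGHTS.values())

# Dividing by the actual sum makes the normalized weights add up to 1,
# whatever the raw total is.
normalized = {band: w / total_weight for band, w in WHO_WEIGHTS.items()}

print(len(WHO_WEIGHTS), total_weight)
```

The raw total is 100,030 rather than exactly 100,000, a consequence of rounding in the published percentages. Normalizing by the computed sum, not by a hard-coded 100,000, keeps the standardized rate a true weighted average.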

The population denominator for each year × age group × sex cell was drawn from INE census projections. For each stratum defined by year \(t\), sex \(s\), and age group \(i\), the age-specific hospitalization rate (per 100,000 person-years) is

\[ \text{Rate}_{i,t,s} \;=\; \frac{d_{i,t,s}}{N_{i,t,s}} \times 10^{5}, \]

where:

  • \(\text{Rate}_{i,t,s}\) — hospitalization rate (per 100,000) in age group \(i\), year \(t\), sex \(s\);
  • \(d_{i,t,s}\) — number of HNC hospital discharges in stratum \((i,t,s)\);
  • \(N_{i,t,s}\) — mid-year population at risk in the same stratum (INE projection);
  • \(i \in \{1,\ldots,17\}\) indexes the 5-year age bands \(0{-}4, 5{-}9, \ldots, 80^{+}\).

The age-standardized rate (ASR), using the WHO World Standard Population as the external reference, is the weighted average of the age-specific rates:

\[ \text{ASR}_{t,s} \;=\; \frac{\sum_{i=1}^{17} w_i \,\text{Rate}_{i,t,s}}{\sum_{i=1}^{17} w_i}, \]

where:

  • \(w_i\) — WHO World Standard weight for age group \(i\) (the 17 weights given above; \(\sum_i w_i = W\));
  • the numerator is, up to the factor \(10^{5}\), the number of cases that would be expected in the standard population had it experienced the age-specific rates observed in stratum \((t,s)\);
  • the denominator \(W\) normalizes the weights, so the ASR is expressed on the same per-100,000 scale as the age-specific rates.

Direct standardization removes the confounding effect of differential age structures across regions and time periods, enabling valid comparisons over time and between geographic units [1].
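The two formulas above can be illustrated with a small worked example. This is a sketch in plain Python, not the pandas pipeline used for the analysis; the three age bands, their weights, and all counts are invented for illustration, and `direct_asr` is a hypothetical helper name.

```python
def age_specific_rates(cases, population, per=100_000):
    """Rate per `per` population for each age band."""
    return {band: cases[band] / population[band] * per for band in cases}


def direct_asr(cases, population, weights, per=100_000):
    """Directly standardized rate: weighted mean of the age-specific rates."""
    rates = age_specific_rates(cases, population, per)
    weighted_sum = sum(weights[band] * rates[band] for band in weights)
    return weighted_sum / sum(weights.values())


# Toy three-band example (illustrative numbers, not study data).
weights = {"young": 50, "middle": 30, "old": 20}
cases = {"young": 1, "middle": 10, "old": 40}
population = {"young": 100_000, "middle": 50_000, "old": 20_000}

crude = sum(cases.values()) / sum(population.values()) * 100_000
asr = direct_asr(cases, population, weights)
print(round(crude, 2), round(asr, 2))
```

Here the age-specific rates are 1, 20, and 200 per 100,000. The crude rate is 30.0 and the standardized rate is 46.5, because the toy standard gives older ages more weight than the observed population does. Every band needs an entry in `cases`, with 0 where no discharges occurred, so that empty bands contribute a zero rate and are not silently dropped.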

2.3.1 Proportional Redistribution of Age Bands

EGRESOS data from 2010 to 2023 report age in 10-year bands (e.g., “10–19”), but direct standardization with the WHO reference requires 5-year bands. For each 10-year band \(B\) comprising two 5-year subgroups \(b_1\) and \(b_2\), hospitalizations were apportioned in proportion to the population at risk in each subgroup:

\[ d_{b_j} \;=\; d_B \times \frac{N_{b_j}}{N_{b_1} + N_{b_2}}, \qquad j \in \{1, 2\}, \]

where:

  • \(d_B\) — total discharges observed in the 10-year band \(B\);
  • \(N_{b_j}\) — INE projected population in the \(j\)-th 5-year subgroup, for the same year, sex, and geographic unit;
  • \(d_{b_j}\) — imputed number of discharges allocated to subgroup \(b_j\), satisfying \(d_{b_1} + d_{b_2} = d_B\).

This allocation is the expected split of \(d_B\) under the assumption that the age-specific hospitalization rate is approximately uniform within the 10-year band (so \(d_{b_j} / N_{b_j}\) does not vary across the two 5-year cells). It is a standard demographic redistribution device when finer age detail is unavailable, although it understates the age gradient within a band wherever rates rise steeply with age. Data from 2024 are already provided in 5-year bands and were used directly without redistribution.
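A minimal sketch of the apportionment for one 10-year band follows. The function name `split_ten_year_band` and all numbers are illustrative assumptions, not values from the study.

```python
def split_ten_year_band(discharges, pop_lower, pop_upper):
    """Apportion a 10-year band's discharges to its two 5-year subgroups
    in proportion to the population at risk in each subgroup."""
    total_pop = pop_lower + pop_upper
    if total_pop <= 0:
        raise ValueError("population at risk must be positive")
    d_lower = discharges * pop_lower / total_pop
    d_upper = discharges - d_lower  # the two parts always sum to d_B
    return d_lower, d_upper


# Illustrative numbers only: 30 discharges in the band 60-69, with
# 600,000 people aged 60-64 and 400,000 aged 65-69.
d_60_64, d_65_69 = split_ten_year_band(30, 600_000, 400_000)
print(d_60_64, d_65_69)

# Both subgroups end up with the same implied rate, which is exactly
# the uniform-rate assumption.
print(d_60_64 / 600_000, d_65_69 / 400_000)
```

The imputed counts are generally fractional. That is harmless for rate calculations, but they should not be reported as observed discharges.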

2.4 Geospatial Analysis

2.4.1 Municipal-Level Rate Estimation

Municipal-level (comuna) ASR values were computed by aggregating hospitalizations and population denominators across the entire 2010–2024 period for each municipality, then applying the same direct standardization procedure described above. The resulting rates were joined to a shapefile of Chilean municipalities for cartographic visualization using choropleth maps. Insular and Antarctic municipalities (Isla de Pascua, Juan Fernández, and Antártica) were excluded from the spatial analysis because they are geographically isolated and have no contiguous neighbors.

2.4.2 Spatial Weights Matrix

Spatial relationships between municipalities were formalized through a Queen contiguity weights matrix \(\mathbf{W}\), where two municipalities \(i\) and \(j\) are considered neighbors (\(w_{ij} = 1\)) if they share any portion of their boundary (edge or vertex), and \(w_{ij} = 0\) otherwise, with \(w_{ii} = 0\) by convention. The weights were row-standardized such that \(\sum_j w_{ij} = 1\) for each municipality \(i\), ensuring that the spatial lag of a variable represents the weighted average of its neighbors’ values.
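To make the Queen criterion and row standardization concrete, the sketch below builds both on a small regular grid. The grid stands in for the municipal polygons, which the analysis handles with libpysal; `queen_neighbors` and `row_standardize` are hypothetical helper names.

```python
def queen_neighbors(n_rows, n_cols):
    """Queen contiguity on a regular grid: cells sharing an edge or a vertex."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cell = r * n_cols + c
            neighbors[cell] = [
                rr * n_cols + cc
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
                if (rr, cc) != (r, c)  # w_ii = 0 by convention
            ]
    return neighbors


def row_standardize(neighbors):
    """Give each neighbor of unit i the weight 1 / (number of neighbors of i)."""
    return {
        i: {j: 1.0 / len(js) for j in js}
        for i, js in neighbors.items()
        if js  # units without neighbors are skipped
    }


nbrs = queen_neighbors(3, 3)
weights = row_standardize(nbrs)
print(len(nbrs[4]), len(nbrs[0]))  # centre cell vs corner cell
print(weights[0])
```

On a 3 × 3 grid the centre cell has eight Queen neighbors and a corner cell has three, so after row standardization each neighbor of the corner receives weight 1/3. Real municipalities have irregular neighbor counts, but the weighting works the same way.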

2.4.3 Global Spatial Autocorrelation — Moran’s I

Global spatial autocorrelation of the municipal ASR was assessed with Moran’s I [2]:

\[ I \;=\; \frac{n}{S_0}\;\frac{\displaystyle\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\displaystyle\sum_{i=1}^{n}(x_i - \bar{x})^{2}}, \qquad S_0 \,=\, \sum_{i}\sum_{j} w_{ij}, \]

where:

  • \(n\) — number of spatial units (municipalities);
  • \(x_i\) — municipal ASR for unit \(i\) over the 2010–2024 period;
  • \(\bar{x} = \tfrac{1}{n}\sum_i x_i\) — unweighted mean of the ASR across units;
  • \(w_{ij}\) — element \((i,j)\) of the row-standardized Queen contiguity weights matrix \(\mathbf{W}\);
  • \(S_0\) — sum of all spatial weights (with row standardization, \(S_0 = n\)).

With row-standardized weights, the statistic equals the slope of the least-squares regression of the spatial lag \(\sum_j w_{ij} x_j\) on \(x_i\), and it can be read loosely as a correlation between each value and the average of its neighbors. It typically ranges from approximately \(-1\) (perfect dispersion / negative autocorrelation) to \(+1\) (perfect clustering), with the expectation under the null of spatial randomness

\[ \mathbb{E}[I] \;=\; -\frac{1}{n-1}. \]

Significance was evaluated with the analytical normal approximation, using the standardized statistic

\[ z \;=\; \frac{I - \mathbb{E}[I]}{\sqrt{\mathrm{Var}(I)}}, \]

compared against \(\mathcal{N}(0,1)\) at \(\alpha = 0.05\).
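The global statistic and its normal-approximation test can be sketched in plain Python on a toy example. Two assumptions are made here: the variance uses the standard closed form under the normality assumption, which the text does not spell out, and the p-value is two-sided. All function names are hypothetical, and the analysis itself relies on esda.

```python
import math


def morans_i(x, weights):
    """Global Moran's I for values `x` and weights given as {i: {j: w_ij}}."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    s0 = sum(w for row in weights.values() for w in row.values())
    cross = sum(w * z[i] * z[j] for i, row in weights.items() for j, w in row.items())
    return (n / s0) * cross / sum(v * v for v in z)


def moran_variance_normal(weights, n):
    """Var(I) under the normality assumption (closed form; an assumption here)."""
    w = [[weights.get(i, {}).get(j, 0.0) for j in range(n)] for i in range(n)]
    s0 = sum(map(sum, w))
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum((sum(w[i]) + sum(row[i] for row in w)) ** 2 for i in range(n))
    expected = -1.0 / (n - 1)
    return (n * n * s1 - n * s2 + 3 * s0 ** 2) / ((n * n - 1) * s0 ** 2) - expected ** 2


# Four units on a line (0-1-2-3), row-standardized weights, smooth gradient.
weights = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}, 2: {1: 0.5, 3: 0.5}, 3: {2: 1.0}}
x = [1.0, 2.0, 3.0, 4.0]

n = len(x)
i_obs = morans_i(x, weights)
e_i = -1.0 / (n - 1)
var_i = moran_variance_normal(weights, n)
z_score = (i_obs - e_i) / math.sqrt(var_i)
p_two_sided = math.erfc(abs(z_score) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
print(round(i_obs, 3), round(e_i, 3), round(z_score, 2), round(p_two_sided, 3))
```

For this gradient the statistic is 0.4 against a null expectation of −1/3. With only four units the normal approximation is crude, so the example shows the mechanics only and says nothing about significance.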

2.4.4 Local Indicators of Spatial Association (LISA)

To localize clusters and spatial outliers, Local Indicators of Spatial Association (LISA) were computed [3]. The local Moran statistic for municipality \(i\) is

\[ I_i \;=\; \frac{(x_i - \bar{x})}{\sigma^{2}_{x}} \sum_{j=1}^{n} w_{ij}\,(x_j - \bar{x}), \qquad \sigma^{2}_{x} \,=\, \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^{2}, \]

where:

  • \(x_i\), \(\bar{x}\), \(w_{ij}\) — defined as in the global Moran’s I above;
  • \(\sigma^{2}_{x}\) — empirical variance of the ASR across the \(n\) municipalities;
  • \(I_i\) — contribution of municipality \(i\) to the global Moran’s I; summing over units gives \(I = \sum_i I_i / S_0\) with \(S_0 = \sum_i \sum_j w_{ij}\), so with row-standardized weights (\(S_0 = n\)) the global statistic is the mean of the local statistics.

Statistical significance of each \(I_i\) was obtained by conditional randomization with 999 permutations: \(x_i\) is held fixed while the remaining \(n-1\) values are randomly permuted across locations, building a reference distribution under the null of spatial randomness. The pseudo \(p\)-value is the proportion of permuted statistics at least as extreme as the observed \(I_i\); significance was set at \(p < 0.05\).
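The local statistic and its conditional permutation test can be sketched as follows. Two choices are assumptions rather than details taken from the text: extremeness is judged on the absolute value of the statistic, and the pseudo p-value uses the common (count + 1) / (permutations + 1) convention. Function names are hypothetical.

```python
import random


def local_moran(x, weights):
    """Local Moran I_i = z_i / sigma^2 * sum_j w_ij z_j, sigma^2 = sum(z^2) / n."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    sigma2 = sum(v * v for v in z) / n
    return [
        z[i] / sigma2 * sum(w * z[j] for j, w in weights[i].items())
        for i in range(n)
    ]


def conditional_p_value(x, weights, i, n_perm=999, seed=0):
    """Pseudo p-value for unit i: x_i stays fixed, the other values are shuffled."""
    rng = random.Random(seed)
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    scale = z[i] / (sum(v * v for v in z) / n)
    observed = scale * sum(w * z[j] for j, w in weights[i].items())
    positions = [j for j in range(n) if j != i]
    others = [z[j] for j in positions]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(others)  # reassign the remaining n-1 values to locations
        shuffled = dict(zip(positions, others))
        stat = scale * sum(w * shuffled[j] for j, w in weights[i].items())
        if abs(stat) >= abs(observed):
            extreme += 1
    return (extreme + 1) / (n_perm + 1)  # observed value counts as one draw


# Six units on a line with row-standardized weights (toy data).
weights = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}, 2: {1: 0.5, 3: 0.5},
           3: {2: 0.5, 4: 0.5}, 4: {3: 0.5, 5: 0.5}, 5: {4: 1.0}}
x = [9.0, 8.0, 7.0, 2.0, 1.0, 0.0]

local_i = local_moran(x, weights)
p_first = conditional_p_value(x, weights, i=0)
print([round(v, 2) for v in local_i], p_first)
```

Because the mean and variance are unchanged by shuffling, only the spatial lag of unit \(i\) varies across permutations. With six units the reference distribution has very few distinct values, so small pseudo p-values are unattainable; the several hundred municipalities in the study do not share that limitation.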

Based on the Moran scatterplot quadrant (the signs of \(x_i - \bar{x}\) and of its spatial lag) and on the significance of \(I_i\), each municipality was classified into one of five categories: High-High (hot spot: high ASR municipality surrounded by high ASR neighbors), Low-Low (cold spot: low ASR surrounded by low ASR), High-Low (spatial outlier: high ASR surrounded by low ASR), Low-High (spatial outlier: low ASR surrounded by high ASR), or Not Significant.
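The five-way classification can be sketched as a quadrant lookup gated by the p-value. The function name, the toy values, and the p-values below are illustrative assumptions; ties at exactly zero deviation or zero lag are assigned arbitrarily in this sketch.

```python
def classify_lisa(x, weights, p_values, alpha=0.05):
    """Label each unit by its Moran-scatterplot quadrant when p < alpha."""
    n = len(x)
    mean = sum(x) / n
    labels = []
    for i in range(n):
        z_i = x[i] - mean
        lag_i = sum(w * (x[j] - mean) for j, w in weights[i].items())
        if p_values[i] >= alpha:
            labels.append("Not Significant")
        elif z_i > 0 and lag_i > 0:
            labels.append("High-High")
        elif z_i < 0 and lag_i < 0:
            labels.append("Low-Low")
        elif z_i > 0:
            labels.append("High-Low")
        else:
            labels.append("Low-High")
    return labels


# Five units on a line with one isolated peak in the middle (toy data).
weights = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}, 2: {1: 0.5, 3: 0.5},
           3: {2: 0.5, 4: 0.5}, 4: {3: 1.0}}
x = [1.0, 1.0, 10.0, 1.0, 1.0]
p_values = [0.01, 0.01, 0.01, 0.01, 0.20]  # hypothetical pseudo p-values

labels = classify_lisa(x, weights, p_values)
print(labels)
```

The central peak is a High-Low outlier and its immediate neighbors are Low-High. The first end unit is Low-Low, while the last one is left unclassified because its hypothetical p-value exceeds 0.05.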

2.5 Anatomical Classification

ICD-10 codes were grouped into 18 sites and further into 8 clinical subsites: oral cavity (C00-C06: lip, tongue, gum, floor of mouth, palate, other mouth), salivary glands (C07-C08: parotid, other salivary), oropharynx (C09-C10: tonsil, oropharynx), nasopharynx (C11), hypopharynx (C12-C13: piriform sinus, hypopharynx), other pharynx (C14), nasal cavity/sinuses (C30-C31: nasal cavity/middle ear, accessory sinuses), and larynx (C32).
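The grouping can be written as a lookup from three-character category to subsite. This is a sketch; `SUBSITE_RANGES`, `SUBSITE_BY_CATEGORY`, and `subsite_of` are hypothetical names, and the analysis code may organize the mapping differently.

```python
# Numeric part of the ICD-10 category -> clinical subsite, as grouped above.
SUBSITE_RANGES = {
    "Oral cavity": range(0, 7),             # C00-C06
    "Salivary glands": range(7, 9),         # C07-C08
    "Oropharynx": range(9, 11),             # C09-C10
    "Nasopharynx": range(11, 12),           # C11
    "Hypopharynx": range(12, 14),           # C12-C13
    "Other pharynx": range(14, 15),         # C14
    "Nasal cavity/sinuses": range(30, 32),  # C30-C31
    "Larynx": range(32, 33),                # C32
}

SUBSITE_BY_CATEGORY = {
    f"C{num:02d}": subsite
    for subsite, numbers in SUBSITE_RANGES.items()
    for num in numbers
}


def subsite_of(code):
    """Map an ICD-10 code (e.g. 'C320') to its subsite, or None if out of scope."""
    return SUBSITE_BY_CATEGORY.get(str(code).strip().upper()[:3])


print(len(SUBSITE_BY_CATEGORY), len(SUBSITE_RANGES))
print(subsite_of("C021"), subsite_of("C32"), subsite_of("C15"))
```

The mapping covers 18 categories in 8 subsites, matching the counts stated above. Under this grouping base of tongue (C01) falls under oral cavity.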

2.6 Software

All analyses were conducted in Python 3.11. Age standardization and data manipulation were performed using pandas (v2.0+) and NumPy. Geospatial operations used GeoPandas for spatial joins and cartographic rendering, libpysal for the construction of spatial weights matrices, and esda for the computation of global Moran’s I and LISA statistics. Visualization was produced with Matplotlib and Seaborn. The computational document was rendered using Quarto with embedded Python code blocks to ensure full reproducibility.

3 Results

3.1 Age-Standardized Rates (ASR)

3.1.1 ASR by sex and year

Code
asr_by_sex.style.format({
    'hospitalizations': '{:,.0f}',
    'population': '{:,.0f}',
    'crude_rate': '{:.2f}',
    'asr': '{:.2f}'
}).hide(axis="index")
year sex hospitalizations population crude_rate asr
2010 Female 406 8,666,525 4.68 4.01
2010 Male 988 8,397,402 11.77 11.53
2011 Female 433 8,762,836 4.94 4.12
2011 Male 990 8,491,323 11.66 11.22
2012 Female 483 8,858,785 5.45 4.54
2012 Male 1,067 8,584,706 12.43 11.71
2013 Female 553 8,944,258 6.18 5.12
2013 Male 1,106 8,667,644 12.76 11.77
2014 Female 526 9,033,189 5.82 4.79
2014 Male 1,071 8,754,428 12.23 11.04
2015 Female 543 9,125,974 5.95 4.76
2015 Male 1,079 8,845,449 12.20 10.80
2016 Female 520 9,223,665 5.64 4.53
2016 Male 1,091 8,943,482 12.20 10.55
2017 Female 543 9,344,975 5.81 4.56
2017 Male 1,107 9,074,217 12.20 10.40
2018 Female 633 9,506,921 6.66 5.16
2018 Male 1,102 9,244,484 11.92 10.06
2019 Female 586 9,683,077 6.05 4.64
2019 Male 1,092 9,424,139 11.59 9.63
2020 Female 486 9,859,209 4.93 3.76
2020 Male 929 9,599,101 9.68 7.93
2021 Female 576 9,969,851 5.78 4.26
2021 Male 1,055 9,708,512 10.87 8.75
2022 Female 609 10,045,585 6.06 4.53
2022 Male 1,216 9,782,978 12.43 9.78
2023 Female 520 10,112,423 5.14 3.72
2023 Male 1,204 9,848,466 12.23 9.44
2024 Female 686 10,175,877 6.74 4.85
2024 Male 1,375 9,910,500 13.87 10.60
Table 1: Age-standardized hospitalization rates for head and neck neoplasms by sex (per 100,000 pop.)
Code
display_cols = asr_sex_cols.copy()
display_cols.columns = ['Year', 'Hosp. Males', 'Pop. Males', 'Crude Rate M', 'ASR M',
                         'Hosp. Females', 'Pop. Females', 'Crude Rate F', 'ASR F',
                         'Hosp. Total', 'Pop. Total', 'Crude Rate Total', 'ASR Total']

display_cols.style.format({
    'Hosp. Males': '{:,.0f}', 'Pop. Males': '{:,.0f}', 'Crude Rate M': '{:.2f}', 'ASR M': '{:.2f}',
    'Hosp. Females': '{:,.0f}', 'Pop. Females': '{:,.0f}', 'Crude Rate F': '{:.2f}', 'ASR F': '{:.2f}',
    'Hosp. Total': '{:,.0f}', 'Pop. Total': '{:,.0f}', 'Crude Rate Total': '{:.2f}', 'ASR Total': '{:.2f}'
}).hide(axis="index")
Year Hosp. Males Pop. Males Crude Rate M ASR M Hosp. Females Pop. Females Crude Rate F ASR F Hosp. Total Pop. Total Crude Rate Total ASR Total
2010 988 8,397,402 11.77 11.53 406 8,666,525 4.68 4.01 1,394 17,063,927 8.17 7.45
2011 990 8,491,323 11.66 11.22 433 8,762,836 4.94 4.12 1,423 17,254,159 8.25 7.34
2012 1,067 8,584,706 12.43 11.71 483 8,858,785 5.45 4.54 1,550 17,443,491 8.89 7.77
2013 1,106 8,667,644 12.76 11.77 553 8,944,258 6.18 5.12 1,659 17,611,902 9.42 8.15
2014 1,071 8,754,428 12.23 11.04 526 9,033,189 5.82 4.79 1,597 17,787,617 8.98 7.60
2015 1,079 8,845,449 12.20 10.80 543 9,125,974 5.95 4.76 1,622 17,971,423 9.03 7.50
2016 1,091 8,943,482 12.20 10.55 520 9,223,665 5.64 4.53 1,611 18,167,147 8.87 7.25
2017 1,107 9,074,217 12.20 10.40 543 9,344,975 5.81 4.56 1,650 18,419,192 8.96 7.20
2018 1,102 9,244,484 11.92 10.06 633 9,506,921 6.66 5.16 1,735 18,751,405 9.25 7.39
2019 1,092 9,424,139 11.59 9.63 586 9,683,077 6.05 4.64 1,678 19,107,216 8.78 6.92
2020 929 9,599,101 9.68 7.93 486 9,859,209 4.93 3.76 1,415 19,458,310 7.27 5.66
2021 1,055 9,708,512 10.87 8.75 576 9,969,851 5.78 4.26 1,631 19,678,363 8.29 6.34
2022 1,216 9,782,978 12.43 9.78 609 10,045,585 6.06 4.53 1,825 19,828,563 9.20 6.95
2023 1,204 9,848,466 12.23 9.44 520 10,112,423 5.14 3.72 1,724 19,960,889 8.64 6.38
2024 1,375 9,910,500 13.87 10.60 686 10,175,877 6.74 4.85 2,061 20,086,377 10.26 7.52
Table 2: Hospitalizations and ASR by sex — Head and neck neoplasms (columnar format, per 100,000 pop.)

3.1.2 Time series of ASR

Code
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

ax1 = axes[0]
ax1.plot(asr_sex_cols['year'], asr_sex_cols['asr_male'], marker='o', linewidth=2, markersize=7,
         label='Males', color='steelblue')
ax1.plot(asr_sex_cols['year'], asr_sex_cols['asr_female'], marker='s', linewidth=2, markersize=7,
         label='Females', color='coral')
ax1.plot(asr_sex_cols['year'], asr_sex_cols['asr_total'], marker='^', linewidth=2, markersize=7,
         label='Total', color='forestgreen', linestyle='--')
ax1.set_xlabel('Year')
ax1.set_ylabel('ASR (per 100,000)')
ax1.set_title('Age-Standardized Rate', fontweight='bold')
ax1.legend()
ax1.grid(True, alpha=0.3)

ax2 = axes[1]
ax2.plot(asr_sex_cols['year'], asr_sex_cols['crude_rate_male'], marker='o', linewidth=2, markersize=7,
         label='Males', color='steelblue')
ax2.plot(asr_sex_cols['year'], asr_sex_cols['crude_rate_female'], marker='s', linewidth=2, markersize=7,
         label='Females', color='coral')
ax2.plot(asr_sex_cols['year'], asr_sex_cols['crude_rate_total'], marker='^', linewidth=2, markersize=7,
         label='Total', color='forestgreen', linestyle='--')
ax2.set_xlabel('Year')
ax2.set_ylabel('Crude rate (per 100,000)')
ax2.set_title('Crude Rate', fontweight='bold')
ax2.legend()
ax2.grid(True, alpha=0.3)

ax3 = axes[2]
ax3.plot(asr_sex_cols['year'], asr_sex_cols['hosp_male'], marker='o', linewidth=2, markersize=7,
         label='Males', color='steelblue')
ax3.plot(asr_sex_cols['year'], asr_sex_cols['hosp_female'], marker='s', linewidth=2, markersize=7,
         label='Females', color='coral')
ax3.plot(asr_sex_cols['year'], asr_sex_cols['hosp_total'], marker='^', linewidth=2, markersize=7,
         label='Total', color='forestgreen', linestyle='--')
ax3.set_xlabel('Year')
ax3.set_ylabel('Hospitalizations (N)')
ax3.set_title('Total Hospitalizations', fontweight='bold')
ax3.legend()
ax3.grid(True, alpha=0.3)

plt.tight_layout()
save_figure(fig, 'asr_time_series_head_neck.png')
plt.savefig(_FIG / "fig-asr-time-series.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 1: Age-standardized rate trends — Malignant neoplasms of head and neck (C00-C14, C30-C32), 2010–2024

3.2 Analysis by Anatomical Site

3.2.1 ASR by site location

Code
latest_year = asr_by_site['year'].max()
asr_latest = asr_by_site[asr_by_site['year'] == latest_year].copy()
asr_latest['site_name'] = asr_latest['icd_category'].map(SITE_NAMES)

asr_latest[['icd_category', 'site_name', 'hospitalizations', 'asr']].style.format({
    'hospitalizations': '{:,.0f}',
    'asr': '{:.2f}'
}).hide(axis="index")
icd_category site_name hospitalizations asr
C00 Lip (C00) 119 0.39
C01 Base of tongue (C01) 49 0.17
C02 Other tongue (C02) 258 0.96
C03 Gum (C03) 17 0.07
C04 Floor of mouth (C04) 59 0.21
C05 Palate (C05) 68 0.25
C06 Other mouth (C06) 120 0.44
C07 Parotid gland (C07) 171 0.64
C08 Other salivary glands (C08) 77 0.30
C09 Tonsil (C09) 148 0.58
C10 Oropharynx (C10) 144 0.52
C11 Nasopharynx (C11) 88 0.42
C12 Piriform sinus (C12) 5 0.02
C13 Hypopharynx (C13) 62 0.20
C14 Other pharynx (C14) 49 0.18
C30 Nasal cavity/middle ear (C30) 89 0.32
C31 Accessory sinuses (C31) 107 0.40
C32 Larynx (C32) 431 1.44
Table 3: Age-standardized rates by site — Latest available year (per 100,000 pop.)
Code
site_summary = asr_by_site.groupby('icd_category').agg({
    'hospitalizations': 'sum',
    'asr': 'mean'
}).reset_index()
site_summary['asr'] = site_summary['asr'].round(2)
site_summary['site_name'] = site_summary['icd_category'].map(SITE_NAMES)
site_summary = site_summary.sort_values('hospitalizations', ascending=False)

site_summary[['icd_category', 'site_name', 'hospitalizations', 'asr']].style.format({
    'hospitalizations': '{:,.0f}',
    'asr': '{:.2f}'
}).hide(axis="index")
icd_category site_name hospitalizations asr
C32 Larynx (C32) 6,133 1.74
C02 Other tongue (C02) 2,886 0.85
C07 Parotid gland (C07) 1,956 0.58
C09 Tonsil (C09) 1,544 0.45
C10 Oropharynx (C10) 1,464 0.42
C00 Lip (C00) 1,459 0.40
C06 Other mouth (C06) 1,400 0.41
C31 Accessory sinuses (C31) 1,270 0.39
C30 Nasal cavity/middle ear (C30) 959 0.28
C13 Hypopharynx (C13) 909 0.25
C11 Nasopharynx (C11) 826 0.26
C04 Floor of mouth (C04) 783 0.23
C08 Other salivary glands (C08) 732 0.22
C05 Palate (C05) 694 0.20
C14 Other pharynx (C14) 624 0.18
C01 Base of tongue (C01) 587 0.17
C03 Gum (C03) 266 0.08
C12 Piriform sinus (C12) 83 0.03
Table 4: Summary of hospitalizations and mean ASR by site (2010–2024)
Code
fig, axes = plt.subplots(1, 2, figsize=(16, 7))

# Pie chart
ax1 = axes[0]
colors = [SITE_COLORS.get(s, '#cccccc') for s in site_summary['icd_category']]
wedges, texts, autotexts = ax1.pie(
    site_summary['hospitalizations'], labels=site_summary['icd_category'],
    autopct='%1.1f%%', colors=colors, startangle=90)
ax1.set_title('Percentage distribution', fontsize=12, fontweight='bold')
for autotext in autotexts:
    autotext.set_fontsize(8)

# Horizontal bar
ax2 = axes[1]
site_sorted = site_summary.sort_values('hospitalizations', ascending=True)
bar_colors = [SITE_COLORS.get(s, '#cccccc') for s in site_sorted['icd_category']]
ax2.barh(site_sorted['icd_category'], site_sorted['hospitalizations'], color=bar_colors)
ax2.set_xlabel('Total Hospitalizations', fontsize=11, fontweight='bold')
ax2.set_title('Hospitalization burden by site', fontsize=12, fontweight='bold')
ax2.grid(axis='x', alpha=0.3)

plt.tight_layout()
save_figure(fig, 'site_distribution_head_neck.png')
plt.savefig(_FIG / "fig-site-distribution.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 2: Distribution of hospitalizations by site — Head and neck neoplasms (2010–2024)
Code
heatmap_data = asr_by_site.pivot_table(index='icd_category', columns='year', values='asr', aggfunc='first')

fig, ax = plt.subplots(figsize=(16, 10))
sns.heatmap(heatmap_data, annot=True, fmt='.1f', cmap='YlOrRd',
            cbar_kws={'label': 'ASR (per 100,000)'}, ax=ax)
ax.set_title('ASR Heatmap — Head and neck neoplasms (C00-C14, C30-C32) by site and year', fontweight='bold', fontsize=14)
ax.set_xlabel('Year')
ax.set_ylabel('Site')
plt.tight_layout()
save_figure(fig, 'asr_heatmap_site_head_neck.png')
plt.savefig(_FIG / "fig-heatmap-asr-site.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 4: Heatmap of ASR by site and year

3.3 Sex Ratio

Code
sr_display = sex_ratio[['year', 'hosp_male', 'hosp_female', 'sex_ratio_n',
                         'asr_male', 'asr_female', 'sex_ratio_asr']].copy()
sr_display.columns = ['Year', 'Hosp. Males', 'Hosp. Females', 'Ratio M:F (N)',
                       'ASR Males', 'ASR Females', 'Ratio M:F (ASR)']

sr_display.style.format({
    'Hosp. Males': '{:,.0f}',
    'Hosp. Females': '{:,.0f}',
    'Ratio M:F (N)': '{:.2f}',
    'ASR Males': '{:.2f}',
    'ASR Females': '{:.2f}',
    'Ratio M:F (ASR)': '{:.2f}'
}).hide(axis="index")
Year Hosp. Males Hosp. Females Ratio M:F (N) ASR Males ASR Females Ratio M:F (ASR)
2010 988 406 2.43 11.53 4.01 2.88
2011 990 433 2.29 11.22 4.12 2.72
2012 1,067 483 2.21 11.71 4.54 2.58
2013 1,106 553 2.00 11.77 5.12 2.30
2014 1,071 526 2.04 11.04 4.79 2.30
2015 1,079 543 1.99 10.80 4.76 2.27
2016 1,091 520 2.10 10.55 4.53 2.33
2017 1,107 543 2.04 10.40 4.56 2.28
2018 1,102 633 1.74 10.06 5.16 1.95
2019 1,092 586 1.86 9.63 4.64 2.08
2020 929 486 1.91 7.93 3.76 2.11
2021 1,055 576 1.83 8.75 4.26 2.05
2022 1,216 609 2.00 9.78 4.53 2.16
2023 1,204 520 2.32 9.44 3.72 2.54
2024 1,375 686 2.00 10.60 4.85 2.19
Table 5: Male:female ratio by year — Head and neck neoplasms
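The two ratio columns can be reproduced from the other columns of the same row. The sketch below uses the 2010 row; `sex_ratios` is a hypothetical helper, and the published ASR ratio was presumably computed from unrounded rates, so recomputing from the rounded values shown may differ in the last digit for some years.

```python
def sex_ratios(hosp_male, hosp_female, asr_male, asr_female):
    """Return (count ratio, ASR ratio), male over female, rounded to 2 decimals."""
    return round(hosp_male / hosp_female, 2), round(asr_male / asr_female, 2)


# 2010 row of the table above.
ratio_n, ratio_asr = sex_ratios(988, 406, 11.53, 4.01)
print(ratio_n, ratio_asr)
```

In every year of the table the ASR ratio exceeds the count ratio. That is the expected direction when the female population is both larger and older than the male population, since standardization removes both effects.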
Code
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

ax1 = axes[0]
ax1.bar(sex_ratio['year'], sex_ratio['sex_ratio_n'], color='purple', alpha=0.7, label='Hospitalizations')
ax1.plot(sex_ratio['year'], sex_ratio['sex_ratio_asr'], marker='o', linewidth=2, color='darkred', label='ASR')
ax1.axhline(y=1, color='gray', linestyle='--', linewidth=2, label='Ratio = 1')
ax1.set_xlabel('Year')
ax1.set_ylabel('Male:Female Ratio')
ax1.set_title('Sex ratio over time', fontweight='bold')
ax1.legend()
ax1.grid(True, alpha=0.3, axis='y')

ax2 = axes[1]
latest = sex_ratio[sex_ratio['year'] == sex_ratio['year'].max()].iloc[0]
categories = ['Males', 'Females']
values = [latest['asr_male'], latest['asr_female']]
bars = ax2.bar(categories, values, color=['steelblue', 'coral'], edgecolor='black', linewidth=1.2)
ax2.set_ylabel('ASR (per 100,000)')
ax2.set_title(f'ASR by sex — Year {int(latest["year"])}', fontweight='bold')
ax2.grid(axis='y', alpha=0.3)
for bar, val in zip(bars, values):
    ax2.text(bar.get_x() + bar.get_width()/2, val + 0.5, f'{val:.1f}',
             ha='center', fontsize=12, fontweight='bold')

plt.tight_layout()
save_figure(fig, 'sex_ratio_head_neck.png')
plt.savefig(_FIG / "fig-sex-ratio.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 5: Analysis of sex ratio in hospitalizations for head and neck neoplasms
Code
# Compute sex-specific ASR by site for the latest year
hosp_sex_site = hosp_5year.groupby(['year', 'age_group', 'sex', 'icd_category']).agg(
    {'hospitalizations': 'sum'}).reset_index()
who_weights = pd.DataFrame([{'age_group': k, 'who_weight': v} for k, v in WHO_STANDARD_POPULATION.items()])
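# Collapse to one national row per year x age group x sex so that the merge
# on 'age_group' below cannot duplicate hospitalization rows
pop_agg = population.groupby(['year', 'age_group', 'sex'], as_index=False)['population'].sum()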

site_sex_list = []
for site in site_summary['icd_category'].tolist():
    for sex in ['Male', 'Female']:
        site_sex_data = hosp_sex_site[
            (hosp_sex_site['year'] == latest_year) &
            (hosp_sex_site['icd_category'] == site) &
            (hosp_sex_site['sex'] == sex)
        ]
        sex_pop = pop_agg[(pop_agg['year'] == latest_year) & (pop_agg['sex'] == sex)]
        merged = site_sex_data.merge(sex_pop[['age_group', 'population']], on='age_group', how='left')
        merged = merged[merged['population'] > 0].dropna(subset=['population'])
        merged['age_specific_rate'] = (merged['hospitalizations'] / merged['population']) * 100000
        merged = merged.merge(who_weights, on='age_group', how='left')
        merged['weighted_rate'] = merged['age_specific_rate'] * merged['who_weight']
        asr_val = merged['weighted_rate'].sum() / TOTAL_WHO_WEIGHT
        site_sex_list.append({'icd_category': site, 'sex': sex, 'asr': round(asr_val, 2)})

site_sex_df = pd.DataFrame(site_sex_list)
site_sex_pivot = site_sex_df.pivot(index='icd_category', columns='sex', values='asr').fillna(0)
site_sex_pivot['ratio'] = (site_sex_pivot['Male'] / site_sex_pivot['Female'].replace(0, np.nan)).round(2)

fig, axes = plt.subplots(1, 2, figsize=(14, 8))

ax1 = axes[0]
x = np.arange(len(site_sex_pivot))
width = 0.35
ax1.barh(x - width/2, site_sex_pivot['Male'], width, label='Males', color='steelblue')
ax1.barh(x + width/2, site_sex_pivot['Female'], width, label='Females', color='coral')
ax1.set_yticks(x)
ax1.set_yticklabels([SITE_NAMES.get(s, s) for s in site_sex_pivot.index], fontsize=9)
ax1.set_xlabel('ASR (per 100,000)')
ax1.set_title(f'ASR by sex and site — {int(latest_year)}', fontweight='bold')
ax1.legend()
ax1.grid(axis='x', alpha=0.3)

ax2 = axes[1]
ratio_sorted = site_sex_pivot.sort_values('ratio', ascending=True)
colors_ratio = ['coral' if r < 1 else 'steelblue' for r in ratio_sorted['ratio']]
ax2.barh([SITE_NAMES.get(s, s) for s in ratio_sorted.index],
         ratio_sorted['ratio'], color=colors_ratio)
ax2.axvline(x=1, color='gray', linestyle='--', linewidth=2)
ax2.set_xlabel('Male:Female ASR Ratio')
ax2.set_title('Sex ratio by site', fontweight='bold')
ax2.grid(axis='x', alpha=0.3)

plt.tight_layout()
save_figure(fig, 'sex_ratio_by_site_head_neck.png')
plt.savefig(_FIG / "fig-sex-ratio-by-site.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 6: Male:female ASR ratio by site — Head and neck neoplasms (latest year)
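The site loop above is an instance of direct age standardization: each age-specific rate is weighted by a standard population and the weighted sum is divided by the total weight. What follows is a minimal sketch of that arithmetic, not the author's pipeline. The function name `direct_asr`, the three age bands, the weights, and all counts are hypothetical values chosen so the result can be checked by hand.

```python
def direct_asr(cases, population, weights, per=100_000):
    """Directly standardized rate: sum(w_a * rate_a) / sum(w_a)."""
    weighted_sum = 0.0
    for age, weight in weights.items():
        pop = population.get(age, 0)
        if pop <= 0:
            continue  # strata without population contribute nothing
        rate = cases.get(age, 0) / pop * per  # age-specific rate
        weighted_sum += rate * weight
    return weighted_sum / sum(weights.values())


# Hypothetical inputs (NOT the WHO weights and NOT DEIS counts)
weights = {'0-39': 60, '40-64': 30, '65+': 10}
population = {'0-39': 600_000, '40-64': 300_000, '65+': 100_000}
male_cases = {'40-64': 30, '65+': 50}
female_cases = {'40-64': 15, '65+': 20}

asr_male = direct_asr(male_cases, population, weights)
asr_female = direct_asr(female_cases, population, weights)
# Guard against a zero denominator, as the pivot above does with NaN
ratio = asr_male / asr_female if asr_female > 0 else float('nan')
print(f'ASR males={asr_male:.2f}, females={asr_female:.2f}, M:F={ratio:.2f}')
```

Age bands with no recorded cases are treated as a rate of zero, which mirrors how grouped counts simply omit empty strata.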

3.4 Age Distribution

Code
age_hosp = hosp_overall.groupby('age_group')['hospitalizations'].sum()
age_hosp = age_hosp.reindex([a for a in AGE_ORDER if a in age_hosp.index])

age_sex = hosp_overall.groupby(['age_group', 'sex'])['hospitalizations'].sum().unstack(fill_value=0)
age_sex = age_sex.reindex([a for a in AGE_ORDER if a in age_sex.index])

fig, axes = plt.subplots(2, 2, figsize=(16, 10))

# Overall age distribution
ax1 = axes[0, 0]
ax1.bar(age_hosp.index, age_hosp.values, color='teal', alpha=0.7)
ax1.set_xlabel('Age Group')
ax1.set_ylabel('Hospitalizations')
ax1.set_title('Overall distribution', fontweight='bold')
ax1.tick_params(axis='x', rotation=45)
ax1.grid(True, alpha=0.3, axis='y')

# By sex
ax2 = axes[0, 1]
age_sex.plot(kind='bar', ax=ax2, color=['steelblue', 'coral'], width=0.8)
ax2.set_xlabel('Age Group')
ax2.set_ylabel('Hospitalizations')
ax2.set_title('By sex', fontweight='bold')
ax2.tick_params(axis='x', rotation=45)
ax2.legend(title='Sex')
ax2.grid(True, alpha=0.3, axis='y')

# By site and age (top 6 sites)
ax3 = axes[1, 0]
top6 = site_summary.head(6)['icd_category'].tolist()
age_site = hosp_by_site[hosp_by_site['icd_category'].isin(top6)].groupby(
    ['age_group', 'icd_category'])['hospitalizations'].sum().unstack(fill_value=0)
age_site = age_site.reindex([a for a in AGE_ORDER if a in age_site.index])
age_site.plot(kind='bar', ax=ax3, width=0.8, stacked=True,
              color=[SITE_COLORS.get(s, '#ccc') for s in age_site.columns])
ax3.set_xlabel('Age Group')
ax3.set_ylabel('Hospitalizations')
ax3.set_title('By site — top 6 (stacked)', fontweight='bold')
ax3.tick_params(axis='x', rotation=45)
ax3.legend(title='Site', bbox_to_anchor=(1.02, 1), loc='upper left', fontsize=8)
ax3.grid(True, alpha=0.3, axis='y')

# Elderly focus (60+)
ax4 = axes[1, 1]
elderly_groups = ['60-64', '65-69', '70-74', '75-79', '80+']
elderly = hosp_overall[hosp_overall['age_group'].isin(elderly_groups)]
eld_by_sex = elderly.groupby(['age_group', 'sex'])['hospitalizations'].sum().unstack(fill_value=0)
eld_by_sex = eld_by_sex.reindex([a for a in elderly_groups if a in eld_by_sex.index])
eld_by_sex.plot(kind='bar', ax=ax4, color=['steelblue', 'coral'], width=0.8)
ax4.set_xlabel('Age Group')
ax4.set_ylabel('Hospitalizations')
ax4.set_title('Older adult population (≥60 years)', fontweight='bold')
ax4.tick_params(axis='x', rotation=45)
ax4.legend(title='Sex')
ax4.grid(True, alpha=0.3, axis='y')

plt.suptitle('Age patterns — Head and neck neoplasms (C00-C14, C30-C32)', fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
save_figure(fig, 'age_distribution_head_neck.png')
plt.savefig(_FIG / "fig-age-distribution.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 7: Age distribution of hospitalizations for head and neck neoplasms
Code
age_summary = hosp_overall.groupby('age_group')['hospitalizations'].sum().reset_index()
total_h = age_summary['hospitalizations'].sum()
age_summary['percentage'] = (age_summary['hospitalizations'] / total_h * 100).round(2)
age_summary['age_order'] = age_summary['age_group'].apply(lambda x: AGE_ORDER.index(x) if x in AGE_ORDER else 99)
age_summary = age_summary.sort_values('age_order').drop(columns='age_order')
age_summary.columns = ['Age Group', 'Hospitalizations', 'Percentage (%)']

age_summary.style.format({
    'Hospitalizations': '{:,.0f}',
    'Percentage (%)': '{:.2f}'
}).hide(axis="index")
Age Group  Hospitalizations  Percentage (%)
0-4                     115            0.47
5-9                     108            0.44
10-14                   199            0.81
15-19                   194            0.79
20-24                   334            1.36
25-29                   354            1.44
30-34                   624            2.54
35-39                   587            2.39
40-44                 1,220            4.96
45-49                 1,246            5.07
50-54                 2,692           10.96
55-59                 2,464           10.03
60-64                 3,823           15.56
65-69                 3,039           12.36
70-74                 3,001           12.21
75-79                 2,178            8.86
80+                   2,397            9.75
Table 6: Summary of hospitalizations by age group — Head and neck neoplasms
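The counts in Table 6 can be aggregated to quantify how concentrated the burden is among older adults. The sketch below re-enters the table values by hand and computes the share of hospitalizations at ages 60 and over. The variable names are illustrative and are not part of the original notebook.

```python
# Hospitalizations by age group, copied from Table 6
age_counts = {
    '0-4': 115, '5-9': 108, '10-14': 199, '15-19': 194,
    '20-24': 334, '25-29': 354, '30-34': 624, '35-39': 587,
    '40-44': 1220, '45-49': 1246, '50-54': 2692, '55-59': 2464,
    '60-64': 3823, '65-69': 3039, '70-74': 3001, '75-79': 2178,
    '80+': 2397,
}
older_groups = {'60-64', '65-69', '70-74', '75-79', '80+'}

total = sum(age_counts.values())
older_total = sum(n for age, n in age_counts.items() if age in older_groups)
older_share = older_total / total * 100

print(f'{older_total:,} of {total:,} hospitalizations ({older_share:.2f}%) at ages 60+')
```

The total recovered this way matches the 24,575 hospitalizations reported in the abstract.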

3.5 Detailed Analysis by Individual Diagnostic Code

3.5.1 Lip (C00)

Code
fig = plot_site_detail('C00', SITE_COLORS['C00'])
if fig is not None:
    save_figure(fig, 'detail_C00.png')
    plt.savefig(_FIG / "fig-detail-c00.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 8: Detailed analysis — Lip (C00)

3.5.2 Base of tongue (C01)

Code
fig = plot_site_detail('C01', SITE_COLORS['C01'])
if fig is not None:
    save_figure(fig, 'detail_C01.png')
    plt.savefig(_FIG / "fig-detail-c01.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 9: Detailed analysis — Base of tongue (C01)

3.5.3 Other tongue (C02)

Code
fig = plot_site_detail('C02', SITE_COLORS['C02'])
if fig is not None:
    save_figure(fig, 'detail_C02.png')
    plt.savefig(_FIG / "fig-detail-c02.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 10: Detailed analysis — Other tongue (C02)

3.5.4 Gum (C03)

Code
fig = plot_site_detail('C03', SITE_COLORS['C03'])
if fig is not None:
    save_figure(fig, 'detail_C03.png')
    plt.savefig(_FIG / "fig-detail-c03.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 11: Detailed analysis — Gum (C03)

3.5.5 Floor of mouth (C04)

Code
fig = plot_site_detail('C04', SITE_COLORS['C04'])
if fig is not None:
    save_figure(fig, 'detail_C04.png')
    plt.savefig(_FIG / "fig-detail-c04.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 12: Detailed analysis — Floor of mouth (C04)

3.5.6 Palate (C05)

Code
fig = plot_site_detail('C05', SITE_COLORS['C05'])
if fig is not None:
    save_figure(fig, 'detail_C05.png')
    plt.savefig(_FIG / "fig-detail-c05.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 13: Detailed analysis — Palate (C05)

3.5.7 Other mouth (C06)

Code
fig = plot_site_detail('C06', SITE_COLORS['C06'])
if fig is not None:
    save_figure(fig, 'detail_C06.png')
    plt.savefig(_FIG / "fig-detail-c06.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 14: Detailed analysis — Other mouth (C06)

3.5.8 Parotid gland (C07)

Code
fig = plot_site_detail('C07', SITE_COLORS['C07'])
if fig is not None:
    save_figure(fig, 'detail_C07.png')
    plt.savefig(_FIG / "fig-detail-c07.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 15: Detailed analysis — Parotid gland (C07)

3.5.9 Other salivary glands (C08)

Code
fig = plot_site_detail('C08', SITE_COLORS['C08'])
if fig is not None:
    save_figure(fig, 'detail_C08.png')
    plt.savefig(_FIG / "fig-detail-c08.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 16: Detailed analysis — Other salivary glands (C08)

3.5.10 Tonsil (C09)

Code
fig = plot_site_detail('C09', SITE_COLORS['C09'])
if fig is not None:
    save_figure(fig, 'detail_C09.png')
    plt.savefig(_FIG / "fig-detail-c09.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 17: Detailed analysis — Tonsil (C09)

3.5.11 Oropharynx (C10)

Code
fig = plot_site_detail('C10', SITE_COLORS['C10'])
if fig is not None:
    save_figure(fig, 'detail_C10.png')
    plt.savefig(_FIG / "fig-detail-c10.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 18: Detailed analysis — Oropharynx (C10)

3.5.12 Nasopharynx (C11)

Code
fig = plot_site_detail('C11', SITE_COLORS['C11'])
if fig is not None:
    save_figure(fig, 'detail_C11.png')
    plt.savefig(_FIG / "fig-detail-c11.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 19: Detailed analysis — Nasopharynx (C11)

3.5.13 Piriform sinus (C12)

Code
fig = plot_site_detail('C12', SITE_COLORS['C12'])
if fig is not None:
    save_figure(fig, 'detail_C12.png')
    plt.savefig(_FIG / "fig-detail-c12.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 20: Detailed analysis — Piriform sinus (C12)

3.5.14 Hypopharynx (C13)

Code
fig = plot_site_detail('C13', SITE_COLORS['C13'])
if fig is not None:
    save_figure(fig, 'detail_C13.png')
    plt.savefig(_FIG / "fig-detail-c13.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 21: Detailed analysis — Hypopharynx (C13)

3.5.15 Other pharynx (C14)

Code
fig = plot_site_detail('C14', SITE_COLORS['C14'])
if fig is not None:
    save_figure(fig, 'detail_C14.png')
    plt.savefig(_FIG / "fig-detail-c14.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 22: Detailed analysis — Other pharynx (C14)

3.5.16 Nasal cavity and middle ear (C30)

Code
fig = plot_site_detail('C30', SITE_COLORS['C30'])
if fig is not None:
    save_figure(fig, 'detail_C30.png')
    plt.savefig(_FIG / "fig-detail-c30.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 23: Detailed analysis — Nasal cavity and middle ear (C30)

3.5.17 Accessory sinuses (C31)

Code
fig = plot_site_detail('C31', SITE_COLORS['C31'])
if fig is not None:
    save_figure(fig, 'detail_C31.png')
    plt.savefig(_FIG / "fig-detail-c31.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 24: Detailed analysis — Accessory sinuses (C31)

3.5.18 Larynx (C32)

Code
fig = plot_site_detail('C32', SITE_COLORS['C32'])
if fig is not None:
    save_figure(fig, 'detail_C32.png')
    plt.savefig(_FIG / "fig-detail-c32.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 25: Detailed analysis — Larynx (C32)

3.6 Regional Analysis

Code
reg_display = regional_summary.copy()
reg_display.columns = ['Region', 'Hospitalizations', 'Mean population', 'Mean crude rate']

reg_display.style.format({
    'Hospitalizations': '{:,.0f}',
    'Mean population': '{:,.0f}',
    'Mean crude rate': '{:.2f}'
}).hide(axis="index")
Region              Hospitalizations  Mean population  Mean crude rate
Valparaíso                     2,731        1,889,531             9.69
Biobío                         2,299        1,632,796             9.36
Metropolitana                 10,690        7,635,385             9.36
Antofagasta                      879          639,145             9.20
Ñuble                            693          500,469             9.20
Maule                          1,395        1,093,589             8.46
Los Lagos                      1,092          868,713             8.37
Aysén                            123          105,381             8.32
Magallanes                       212          172,147             8.29
Los Ríos                         491          398,354             8.20
Coquimbo                         968          794,452             8.08
O'Higgins                      1,157          955,292             8.07
La Araucanía                   1,060          993,796             7.07
Atacama                          303          304,917             6.60
Tarapacá                         272          349,855             5.15
Arica y Parinacota               159          239,120             4.45
Table 7: Summary of hospitalizations by region — Head and neck neoplasms (2010–2024)
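A crude rate is the number of hospitalizations divided by the population at risk, scaled to 100,000. As a rough cross-check of Table 7, the sketch below derives a pooled annual rate from the total count, the mean population, and a 15-year period (2010–2024). This assumes the tabulated "mean crude rate" is an average of yearly rates, so the pooled value should be close to it but not identical. The helper name `pooled_annual_rate` is hypothetical.

```python
N_YEARS = 15  # 2010-2024 inclusive


def pooled_annual_rate(total_hosp, mean_population, n_years=N_YEARS, per=100_000):
    """Average yearly hospitalizations over mean population, per 100,000."""
    return total_hosp / n_years / mean_population * per


# (total hospitalizations, mean population) copied from Table 7
regions = {
    'Valparaíso': (2731, 1_889_531),
    'Metropolitana': (10_690, 7_635_385),
    'Arica y Parinacota': (159, 239_120),
}

pooled = {name: pooled_annual_rate(h, p) for name, (h, p) in regions.items()}
for name, rate in pooled.items():
    print(f'{name}: {rate:.2f} per 100,000')
```

The small gap between these pooled values and the tabulated means is what one would expect when population changes over the period, because the mean of yearly ratios is not the ratio of the means.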
Code
regional_pivot = regional_merged.pivot_table(
    values='crude_rate', index='region', columns='year', aggfunc='first')

if 2024 in regional_pivot.columns:
    regional_pivot = regional_pivot.loc[regional_pivot[2024].sort_values(ascending=False).index]

fig, ax = plt.subplots(figsize=(16, 10))
sns.heatmap(regional_pivot, annot=False, cmap='YlOrRd', ax=ax,
            cbar_kws={'label': 'Crude rate (per 100,000)'})
ax.set_title('Crude hospitalization rates by region and year\nHead and neck neoplasms (C00-C14, C30-C32)',
             fontsize=13, fontweight='bold')
ax.set_xlabel('Year', fontsize=11, fontweight='bold')
ax.set_ylabel('Region', fontsize=11, fontweight='bold')
plt.xticks(rotation=0)
plt.yticks(rotation=0, fontsize=9)
plt.tight_layout()
save_figure(fig, 'regional_heatmap_head_neck.png')
plt.savefig(_FIG / "fig-regional-heatmap.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 26: Heatmap of crude hospitalization rates by region and year — Head and neck neoplasms (2010–2024)
Code
top_regions = regional_summary.head(10)

fig, ax = plt.subplots(figsize=(14, 8))
colors = plt.cm.YlOrRd(np.linspace(0.3, 0.9, len(top_regions)))
bars = ax.barh(top_regions['region'], top_regions['crude_rate'], color=colors)
ax.set_xlabel('Mean crude rate (per 100,000)', fontsize=11, fontweight='bold')
ax.set_title('Top 10 Regions — Head and neck neoplasms (C00-C14, C30-C32), 2010–2024',
             fontsize=12, fontweight='bold')
ax.grid(axis='x', alpha=0.3)
for bar, val in zip(bars, top_regions['crude_rate']):
    ax.text(val + 0.2, bar.get_y() + bar.get_height()/2, f'{val:.1f}',
            va='center', fontsize=10)
plt.tight_layout()
save_figure(fig, 'top_regions_head_neck.png')
plt.savefig(_FIG / "fig-regional-bars.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 27: Top regions by mean crude hospitalization rate — Head and neck neoplasms
Code
top10_regions = regional_summary.head(10)['region'].tolist()
reg_site_data = egresos_df[egresos_df['region'].isin(top10_regions)].groupby(
    ['region', 'icd_category']).size().reset_index(name='count')
reg_site_pivot = reg_site_data.pivot_table(index='region', columns='icd_category', values='count', fill_value=0)
reg_site_pivot = reg_site_pivot.loc[top10_regions]

fig, ax = plt.subplots(figsize=(16, 10))
reg_site_pivot.plot(kind='barh', stacked=True, ax=ax,
                    color=[SITE_COLORS.get(c, '#ccc') for c in reg_site_pivot.columns])
ax.set_xlabel('Hospitalizations', fontsize=11, fontweight='bold')
ax.set_title('Composition by site — Top 10 regions', fontsize=13, fontweight='bold')
ax.legend(title='Site', bbox_to_anchor=(1.02, 1), loc='upper left', fontsize=8)
ax.grid(axis='x', alpha=0.3)
plt.tight_layout()
save_figure(fig, 'regional_site_composition_head_neck.png')
plt.savefig(_FIG / "fig-regional-site-composition.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 28: Regional composition by site — Top 10 regions (2010–2024)

3.7 Geospatial Analysis — Comunal ASR

Code
if not comuna_asr.empty:
    top_comunas = comuna_asr.head(20)[['comuna', 'hospitalizations', 'asr']].copy()
    top_comunas.columns = ['Municipality', 'Total hospitalizations', 'ASR (per 100,000)']
    # display() is required: a bare expression inside an if-block is not rendered
    display(top_comunas.style.format({
        'Total hospitalizations': '{:,.0f}',
        'ASR (per 100,000)': '{:.2f}'
    }).hide(axis="index"))
Table 8: Top 20 municipalities by mean ASR — Head and neck neoplasms

3.7.1 Choropleth Map — Total Head and Neck Neoplasms ASR

Code
if has_spatial_data:
    fig, ax = plt.subplots(figsize=(10, 14))
    gdf_total.plot(column='asr_total', cmap='YlOrRd', linewidth=0.3, ax=ax, edgecolor='0.7',
                   legend=True, legend_kwds={'label': 'ASR (per 100,000)', 'shrink': 0.5})
    ax.set_title('Head and Neck Neoplasms (C00-C14, C30-C32) — ASR by Municipality\nChile Continental, 2010–2024',
                 fontweight='bold', fontsize=12)
    ax.set_axis_off()
    plt.tight_layout()
    save_figure(fig, 'choropleth_asr_total_head_neck.png')
    plt.savefig(_FIG / "fig-choropleth-total.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 29: ASR distribution — All head and neck neoplasms (C00-C14, C30-C32) by municipality, Chile Continental

3.7.2 Choropleth Maps by Individual Neoplasm

Code
if has_spatial_data and len(gdf_by_site) > 0:
    site_cmaps = {
        'C00': 'Reds', 'C01': 'Blues', 'C02': 'Greens', 'C03': 'Purples',
        'C04': 'Oranges', 'C05': 'YlOrBr', 'C06': 'RdPu', 'C07': 'Greys',
        'C08': 'BuGn', 'C09': 'OrRd', 'C10': 'GnBu', 'C11': 'RdPu',
        'C12': 'YlGn', 'C13': 'PuBu', 'C14': 'BuPu', 'C30': 'YlOrRd',
        'C31': 'RdYlGn', 'C32': 'PuRd',
    }

    sites_with_data = [s for s in site_summary['icd_category'].tolist() if s in gdf_by_site]
    n_sites = len(sites_with_data)
    ncols = 3
    nrows = max(1, (n_sites + ncols - 1) // ncols)

    fig, axes = plt.subplots(nrows, ncols, figsize=(18, 6 * nrows))
    # With ncols=3, plt.subplots always returns an array, even for a single site
    axes = np.atleast_1d(axes).flatten()

    for idx, site_code in enumerate(sites_with_data):
        ax = axes[idx]
        gdf = gdf_by_site[site_code]
        cmap = site_cmaps.get(site_code, 'viridis')
        gdf.plot(column='asr', cmap=cmap, linewidth=0.2, ax=ax, edgecolor='0.7',
                 legend=True, legend_kwds={'label': 'ASR', 'shrink': 0.4, 'pad': 0.01})
        ax.set_title(f'{SITE_NAMES.get(site_code, site_code)}', fontweight='bold', fontsize=10)
        ax.set_axis_off()

    for idx in range(len(sites_with_data), len(axes)):
        axes[idx].set_visible(False)

    plt.suptitle('Site-Specific ASR Distribution by Municipality — Chile Continental (2010–2024)',
                 fontsize=14, fontweight='bold', y=1.01)
    plt.tight_layout()
    save_figure(fig, 'choropleth_asr_by_site_head_neck.png')
    plt.savefig(_FIG / "fig-choropleth-by-site.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 30: ASR distribution by site — Chile Continental (2010–2024)

3.7.3 Global Moran’s I — Spatial Autocorrelation

Code
moran_results = []

if moran_total is not None:
    moran_results.append({
        'Variable': 'Total C00-C14, C30-C32',
        "Moran's I": round(moran_total.I, 4),
        'E[I]': round(moran_total.EI, 4),
        'Z-score': round(moran_total.z_norm, 4),
        'P-value': round(moran_total.p_norm, 4),
        'Interpretation': 'Clustered' if moran_total.I > 0 and moran_total.p_norm < 0.05 else (
            'Dispersed' if moran_total.I < 0 and moran_total.p_norm < 0.05 else 'Random')
    })

for site_code in moran_by_site:
    m = moran_by_site[site_code]
    moran_results.append({
        'Variable': SITE_NAMES.get(site_code, site_code),
        "Moran's I": round(m.I, 4),
        'E[I]': round(m.EI, 4),
        'Z-score': round(m.z_norm, 4),
        'P-value': round(m.p_norm, 4),
        'Interpretation': 'Clustered' if m.I > 0 and m.p_norm < 0.05 else (
            'Dispersed' if m.I < 0 and m.p_norm < 0.05 else 'Random')
    })

if moran_results:
    moran_df = pd.DataFrame(moran_results)
    display(moran_df.style.format({
        "Moran's I": '{:.4f}', 'E[I]': '{:.4f}', 'Z-score': '{:.4f}', 'P-value': '{:.4f}'
    }).hide(axis="index"))
else:
    display(pd.DataFrame({'Message': ['Insufficient data for Moran analysis']}).style.hide(axis="index"))

# Auto-saved table (tbl-moran-global)
if moran_results:  # moran_df exists only when at least one test was run
    _save_table(moran_df, "tbl-moran-global")
[saved table] tbl-moran-global.csv
Variable                       Moran's I     E[I]  Z-score  P-value  Interpretation
Total C00-C14, C30-C32            0.0700  -0.0029   1.9728   0.0485  Clustered
Larynx (C32)                      0.0924  -0.0029   2.5788   0.0099  Clustered
Other tongue (C02)               -0.0120  -0.0029  -0.2468   0.8050  Random
Parotid gland (C07)               0.0932  -0.0029   2.6012   0.0093  Clustered
Tonsil (C09)                      0.0607  -0.0029   1.7213   0.0852  Random
Oropharynx (C10)                  0.1493  -0.0029   4.1185   0.0000  Clustered
Lip (C00)                         0.0437  -0.0029   1.2620   0.2069  Random
Other mouth (C06)                 0.0199  -0.0029   0.6159   0.5379  Random
Accessory sinuses (C31)           0.0813  -0.0029   2.2782   0.0227  Clustered
Nasal cavity/middle ear (C30)     0.0717  -0.0029   2.0196   0.0434  Clustered
Hypopharynx (C13)                 0.1736  -0.0029   4.7759   0.0000  Clustered
Nasopharynx (C11)                 0.0097  -0.0029   0.3418   0.7325  Random
Floor of mouth (C04)              0.1026  -0.0029   2.8550   0.0043  Clustered
Other salivary glands (C08)       0.0176  -0.0029   0.5539   0.5797  Random
Palate (C05)                     -0.0405  -0.0029  -1.0172   0.3091  Random
Other pharynx (C14)              -0.0438  -0.0029  -1.1053   0.2690  Random
Base of tongue (C01)              0.0021  -0.0029   0.1362   0.8917  Random
Gum (C03)                        -0.0018  -0.0029   0.0294   0.9766  Random
Piriform sinus (C12)             -0.0347  -0.0029  -0.8590   0.3903  Random
Table 9: Global Moran’s I — Spatial autocorrelation test (Chile Continental)
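Two columns of Table 9 can be reproduced with a few lines of arithmetic. Under the null hypothesis of no spatial autocorrelation, the expected value of Moran's I is -1/(n-1), and under the normality assumption the two-sided p-value follows from the z-score. The sketch below assumes n = 344 spatial units, which is the sum of the cluster counts in Table 10 and not a figure stated by the author. The helper `two_sided_p` is hypothetical.

```python
import math

n_units = 344  # assumption: total of the LISA categories in Table 10
expected_i = -1.0 / (n_units - 1)


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


p_total = two_sided_p(1.9728)   # z-score reported for the total
p_larynx = two_sided_p(2.5788)  # z-score reported for larynx (C32)

print(f'E[I] = {expected_i:.4f}')
print(f'p (total) = {p_total:.4f}, p (larynx) = {p_larynx:.4f}')
```

Both p-values agree with the tabulated ones to about three decimals, which suggests the reported values are two-sided.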
Code
fig, ax = plt.subplots(figsize=(8, 6))

if moran_total is not None:
    from splot.esda import moran_scatterplot
    moran_scatterplot(moran_total, zstandard=True, ax=ax)
    ax.set_title(f"Total C00-C14, C30-C32 — Moran's I = {moran_total.I:.4f} (p = {moran_total.p_norm:.4f})", fontweight='bold')
else:
    ax.text(0.5, 0.5, 'Insufficient data for Moran analysis', ha='center', va='center', transform=ax.transAxes, fontsize=12)

plt.suptitle("Global Moran's I — Head and Neck Neoplasms (Chile Continental)", fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
save_figure(fig, 'moran_scatterplot_head_neck.png')
plt.savefig(_FIG / "fig-moran-scatterplot.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 31: Moran scatterplot — Spatial autocorrelation (Chile Continental)
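To make the statistic behind the scatterplot concrete, the sketch below computes global Moran's I by hand on a toy 4×4 lattice with Queen contiguity (cells sharing an edge or a corner are neighbors). It is a pure-Python illustration, not the library routine used in the report. Row-standardized weights are an assumption, since the weight transformation is not shown in this section, and the function names and lattice values are made up.

```python
def queen_neighbors(nrows, ncols):
    """Map each cell index to the cells sharing an edge or a corner with it."""
    nbrs = {}
    for r in range(nrows):
        for c in range(ncols):
            nbrs[r * ncols + c] = [
                rr * ncols + cc
                for rr in range(max(0, r - 1), min(nrows, r + 2))
                for cc in range(max(0, c - 1), min(ncols, c + 2))
                if (rr, cc) != (r, c)
            ]
    return nbrs


def morans_i(values, nbrs):
    """Global Moran's I with row-standardized weights (so n / S0 = 1)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    # Spatial lag of cell i = mean deviation of its neighbors
    num = sum(z[i] * sum(z[j] for j in nbrs[i]) / len(nbrs[i]) for i in range(n))
    den = sum(d * d for d in z)
    return num / den


nbrs = queen_neighbors(4, 4)
clustered = [10, 10, 0, 0] * 4          # high values on the left, low on the right
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]

i_clustered = morans_i(clustered, nbrs)
i_checker = morans_i(checkerboard, nbrs)
print(f'clustered: {i_clustered:.3f}, checkerboard: {i_checker:.3f}')
```

The clustered surface yields a positive value and the checkerboard a negative one. In the scatterplot, the same quantity appears as the slope of the spatial lag against the standardized value.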

3.7.4 LISA — Local Indicators of Spatial Association

Code
if has_spatial_data and 'cluster' in gdf_total.columns:
    cl = gdf_total.groupby('cluster').size().reset_index(name='N Comunas')
    cl.columns = ['Cluster', 'N Comunas']
    cl = cl.sort_values('N Comunas', ascending=False)
    display(cl.style.hide(axis="index"))
Cluster               N Comunas
Not significant             288
High-High (Hot Spot)         21
Low-Low (Cold Spot)          16
Low-High                     14
High-Low                      5
Table 10: LISA clusters — Total head and neck neoplasms (C00-C14, C30-C32), Chile Continental
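The cluster labels in Table 10 combine two pieces of information for each comuna: the quadrant of the Moran scatterplot it falls in and whether its local statistic is significant. The sketch below shows one way such labels could be assigned. It assumes quadrant codes follow the esda convention (1 = High-High, 2 = Low-High, 3 = Low-Low, 4 = High-Low) and a 0.05 significance threshold, neither of which is confirmed in this section. The input arrays are invented.

```python
from collections import Counter

QUADRANT_LABELS = {
    1: 'High-High (Hot Spot)',
    2: 'Low-High',
    3: 'Low-Low (Cold Spot)',
    4: 'High-Low',
}


def label_lisa(quadrants, p_values, alpha=0.05):
    """Keep the quadrant label only where the local test is significant."""
    return [
        QUADRANT_LABELS[q] if p < alpha else 'Not significant'
        for q, p in zip(quadrants, p_values)
    ]


# Invented example: six comunas with quadrant codes and pseudo p-values
quadrants = [1, 1, 3, 2, 4, 3]
p_values = [0.01, 0.20, 0.03, 0.04, 0.60, 0.049]

labels = label_lisa(quadrants, p_values)
counts = Counter(labels)
for label, n in counts.most_common():
    print(f'{label}: {n}')
```

Because every comuna is tested separately, some significant labels are expected by chance alone, which is worth keeping in mind when reading the counts above.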
Code
colors_dict = {
    'High-High (Hot Spot)': '#d73027',
    'Low-Low (Cold Spot)': '#4575b4',
    'High-Low': '#fdae61',
    'Low-High': '#abd9e9',
    'Not significant': '#f0f0f0',
    'No data': '#d9d9d9'
}

if has_spatial_data and 'cluster' in gdf_total.columns:
    fig, ax = plt.subplots(figsize=(10, 14))
    gdf_total.plot(ax=ax, color=gdf_total['cluster'].map(colors_dict), edgecolor='gray', linewidth=0.3)
    ax.set_title('LISA Clusters — Total Head and Neck Neoplasms (C00-C14, C30-C32)\nChile Continental', fontsize=12, fontweight='bold')
    ax.set_axis_off()

    legend_elements = [Patch(facecolor=color, edgecolor='black', label=label)
                       for label, color in colors_dict.items() if label != 'No data']
    fig.legend(handles=legend_elements, loc='lower center', ncol=5, fontsize=9,
               frameon=True, bbox_to_anchor=(0.5, 0.02))
    plt.tight_layout()
    save_figure(fig, 'lisa_clusters_total_head_neck.png')
    plt.savefig(_FIG / "fig-lisa-map-total.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 32: LISA cluster map — Total head and neck neoplasms (C00-C14, C30-C32), Chile Continental

3.7.5 LISA Clusters by Individual Neoplasm

Code
sites_with_lisa = [s for s in site_summary['icd_category'].tolist()
                   if s in gdf_by_site and 'cluster' in gdf_by_site[s].columns]

if has_spatial_data and len(sites_with_lisa) > 0:
    n = len(sites_with_lisa)
    ncols = 3
    nrows = max(1, (n + ncols - 1) // ncols)

    fig, axes = plt.subplots(nrows, ncols, figsize=(18, 6 * nrows))
    # With ncols=3, plt.subplots always returns an array, even for a single site
    axes = np.atleast_1d(axes).flatten()

    for idx, site_code in enumerate(sites_with_lisa):
        ax = axes[idx]
        gdf = gdf_by_site[site_code]
        gdf.plot(ax=ax, color=gdf['cluster'].map(colors_dict), edgecolor='gray', linewidth=0.2)
        ax.set_title(SITE_NAMES.get(site_code, site_code), fontweight='bold', fontsize=10)
        ax.set_axis_off()

    for idx in range(n, len(axes)):
        axes[idx].set_visible(False)

    legend_elements = [Patch(facecolor=color, edgecolor='black', label=label)
                       for label, color in colors_dict.items() if label != 'No data']
    fig.legend(handles=legend_elements, loc='lower center', ncol=5, fontsize=9,
               frameon=True, bbox_to_anchor=(0.5, 0.01))
    plt.suptitle('LISA Clusters by Site — Chile Continental (2010–2024)',
                 fontsize=14, fontweight='bold', y=1.01)
    plt.tight_layout()
    save_figure(fig, 'lisa_clusters_by_site_head_neck.png')
    plt.savefig(_FIG / "fig-lisa-by-site.png", dpi=300, bbox_inches="tight")
    plt.show()
Figure 33: LISA cluster maps by site — Chile Continental

3.7.6 Spatial Analysis Summary

Code
if has_spatial_data:
    spatial_rows = []
    # Total
    spatial_rows.append({
        'Site': 'Total C00-C14, C30-C32',
        'Comunas with data': int((gdf_total['asr_total'] > 0).sum()),
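        "Moran's I": round(moran_total.I, 4) if moran_total is not None else None,
        'P-value': round(moran_total.p_norm, 4) if moran_total is not None else None,
        # Same three-way rule as the per-site rows below
        'Pattern': 'Clustered' if moran_total is not None and moran_total.I > 0 and moran_total.p_norm < 0.05 else (
            'Dispersed' if moran_total is not None and moran_total.I < 0 and moran_total.p_norm < 0.05 else 'Random'),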
        'Hot Spots': int((gdf_total['cluster'] == 'High-High (Hot Spot)').sum()) if 'cluster' in gdf_total.columns else 0,
        'Cold Spots': int((gdf_total['cluster'] == 'Low-Low (Cold Spot)').sum()) if 'cluster' in gdf_total.columns else 0,
    })
    # Per site
    for site_code in site_summary['icd_category'].tolist():
        m = moran_by_site.get(site_code)
        gdf = gdf_by_site.get(site_code)
        has_cluster = gdf is not None and 'cluster' in gdf.columns
        spatial_rows.append({
            'Site': SITE_NAMES.get(site_code, site_code),
            'Comunas with data': int((gdf['asr'] > 0).sum()) if gdf is not None else 0,
            "Moran's I": round(m.I, 4) if m else None,
            'P-value': round(m.p_norm, 4) if m else None,
            'Pattern': 'Clustered' if m and m.I > 0 and m.p_norm < 0.05 else ('Dispersed' if m and m.I < 0 and m.p_norm < 0.05 else 'Random'),
            'Hot Spots': int((gdf['cluster'] == 'High-High (Hot Spot)').sum()) if has_cluster else 0,
            'Cold Spots': int((gdf['cluster'] == 'Low-Low (Cold Spot)').sum()) if has_cluster else 0,
        })

    spatial_df = pd.DataFrame(spatial_rows)
    display(spatial_df.style.format({
        "Moran's I": '{:.4f}', 'P-value': '{:.4f}'
    }).hide(axis="index"))

# Auto-saved table (tbl-spatial-summary)
if has_spatial_data:  # spatial_df exists only when spatial data were loaded
    _save_table(spatial_df, "tbl-spatial-summary")
[saved table] tbl-spatial-summary.csv
Site                           Comunas with data  Moran's I  P-value  Pattern    Hot Spots  Cold Spots
Total C00-C14, C30-C32                       323     0.0700   0.0485  Clustered         21          16
Larynx (C32)                                 277     0.0924   0.0099  Clustered          7          23
Other tongue (C02)                           230    -0.0120   0.8050  Random            20          25
Parotid gland (C07)                          222     0.0932   0.0093  Clustered         11          25
Tonsil (C09)                                 182     0.0607   0.0852  Random            10          17
Oropharynx (C10)                             186     0.1493   0.0000  Clustered         20          20
Lip (C00)                                    229     0.0437   0.2069  Random            21          13
Other mouth (C06)                            210     0.0199   0.5379  Random            13          20
Accessory sinuses (C31)                      184     0.0813   0.0227  Clustered         16          21
Nasal cavity/middle ear (C30)                170     0.0717   0.0434  Clustered          8          17
Hypopharynx (C13)                            168     0.1736   0.0000  Clustered         16          20
Nasopharynx (C11)                            156     0.0097   0.7325  Random             4          10
Floor of mouth (C04)                         158     0.1026   0.0043  Clustered         12          42
Other salivary glands (C08)                  154     0.0176   0.5797  Random             4          10
Palate (C05)                                 151    -0.0405   0.3091  Random             3           9
Other pharynx (C14)                          174    -0.0438   0.2690  Random             0          38
Base of tongue (C01)                         137     0.0021   0.8917  Random             2          12
Gum (C03)                                     90    -0.0018   0.9766  Random             5          76
Piriform sinus (C12)                          44    -0.0347   0.3903  Random             0           6
Table 11: Spatial analysis summary by site

3.8 Summary Dashboard

Code
fig, axes = plt.subplots(2, 3, figsize=(16, 12))

# 1. Total trend
ax1 = axes[0, 0]
yearly_total = hosp_overall.groupby('year')['hospitalizations'].sum()
ax1.plot(yearly_total.index, yearly_total.values, marker='o', linewidth=3, color='darkgreen')
ax1.fill_between(yearly_total.index, yearly_total.values, alpha=0.3, color='green')
ax1.set_xlabel('Year')
ax1.set_ylabel('Hospitalizations')
ax1.set_title('Overall trend', fontweight='bold')
ax1.grid(True, alpha=0.3)

# 2. Site distribution pie
ax2 = axes[0, 1]
site_totals = hosp_by_site.groupby('icd_category')['hospitalizations'].sum().sort_values(ascending=False)
pie_colors = [SITE_COLORS.get(s, '#cccccc') for s in site_totals.index]
ax2.pie(site_totals.values, labels=site_totals.index, autopct='%1.1f%%',
        colors=pie_colors, textprops={'fontsize': 7})
ax2.set_title('Distribution by site', fontweight='bold')

# 3. Sex distribution
ax3 = axes[0, 2]
sex_totals = hosp_overall.groupby('sex')['hospitalizations'].sum()
ax3.bar(sex_totals.index, sex_totals.values, color=['steelblue', 'coral'])
ax3.set_ylabel('Hospitalizations')
ax3.set_title('By sex', fontweight='bold')
ax3.grid(True, alpha=0.3, axis='y')

# 4. Age distribution
ax4 = axes[1, 0]
age_dist = hosp_overall.groupby('age_group')['hospitalizations'].sum()
age_dist = age_dist.reindex([a for a in AGE_ORDER if a in age_dist.index])
ax4.bar(age_dist.index, age_dist.values, color='teal', alpha=0.7)
ax4.set_xlabel('Age Group')
ax4.set_ylabel('Hospitalizations')
ax4.set_title('Age distribution', fontweight='bold')
ax4.tick_params(axis='x', rotation=45)
ax4.grid(True, alpha=0.3, axis='y')

# 5. Top sites bar
ax5 = axes[1, 1]
site_totals_sorted = site_totals.sort_values(ascending=True)
bar_c = [SITE_COLORS.get(s, '#cccccc') for s in site_totals_sorted.index]
ax5.barh(site_totals_sorted.index, site_totals_sorted.values, color=bar_c)
ax5.set_xlabel('Hospitalizations')
ax5.set_title('Burden by site', fontweight='bold')
ax5.grid(True, alpha=0.3, axis='x')

# 6. ASR comparison Male vs Female
ax6 = axes[1, 2]
width = 0.35
x = np.arange(len(asr_sex_cols))
ax6.bar(x - width/2, asr_sex_cols['asr_male'], width, label='Males', color='steelblue')
ax6.bar(x + width/2, asr_sex_cols['asr_female'], width, label='Females', color='coral')
ax6.set_xlabel('Year')
ax6.set_ylabel('ASR (per 100,000)')
ax6.set_title('ASR comparison by sex', fontweight='bold')
ax6.set_xticks(x[::2])
ax6.set_xticklabels(asr_sex_cols['year'].values[::2].astype(int))
ax6.legend()
ax6.grid(True, alpha=0.3, axis='y')

plt.suptitle('Head and Neck Neoplasms (C00-C14, C30-C32) in Chile — Dashboard 2010–2024',
             fontsize=16, fontweight='bold', y=1.02)
plt.tight_layout()
save_figure(fig, 'summary_dashboard_head_neck.png')
plt.savefig(_FIG / "fig-summary-dashboard.png", dpi=300, bbox_inches="tight")
plt.show()
Figure 34: Summary dashboard — Head and neck neoplasms (C00-C14, C30-C32) in Chile, 2010–2024
Code
total_hosp = hosp_overall['hospitalizations'].sum()
n_years = len(YEARS)
sex_totals_df = hosp_overall.groupby('sex')['hospitalizations'].sum()
male_total = sex_totals_df.get('Male', 0)
female_total = sex_totals_df.get('Female', 0)
top_site = hosp_by_site.groupby('icd_category')['hospitalizations'].sum().idxmax()

summary_df = pd.DataFrame({
    'Indicator': [
        'Total hospitalizations',
        'Period analyzed',
        'Anatomical sites evaluated',
        'Hospitalizations males',
        'Hospitalizations females',
        'M:F Ratio',
        'Most frequent site',
        'Mean annual hospitalizations'
    ],
    'Value': [
        f'{total_hosp:,.0f}',
        f'{min(YEARS)}–{max(YEARS)} ({n_years} years)',
        f'{hosp_by_site["icd_category"].nunique()} categories',
        f'{male_total:,.0f}',
        f'{female_total:,.0f}',
        f'{male_total/female_total:.2f}' if female_total > 0 else 'N/A',
        f'{top_site} ({SITE_NAMES.get(top_site, top_site)})',
        f'{total_hosp/n_years:,.0f}'
    ]
})

# display() is required because this is not the last expression in the cell
display(summary_df.style.hide(axis="index"))

# Auto-saved table (tbl-summary-stats)
_save_table(summary_df, "tbl-summary-stats")
[saved table] tbl-summary-stats.csv
Table 12: Overall summary statistics — Head and neck neoplasms (C00-C14, C30-C32)
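The headline indicators of Table 12 follow from three numbers. The sketch below recomputes them from the sex-specific totals reported in the abstract (16,472 men and 8,103 women) over 2010–2024. The variable names are illustrative, and `YEARS` is assumed to be the inclusive range of study years.

```python
YEARS = range(2010, 2025)  # assumption: inclusive study period
male_total = 16_472        # from the abstract
female_total = 8_103       # from the abstract

total_hosp = male_total + female_total
n_years = len(YEARS)
mf_ratio = male_total / female_total if female_total > 0 else float('nan')
mean_annual = total_hosp / n_years

summary = {
    'Total hospitalizations': f'{total_hosp:,.0f}',
    'Period analyzed': f'{min(YEARS)}–{max(YEARS)} ({n_years} years)',
    'M:F Ratio': f'{mf_ratio:.2f}',
    'Mean annual hospitalizations': f'{mean_annual:,.0f}',
}
for indicator, value in summary.items():
    print(f'{indicator}: {value}')
```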

3.9 ICD-10 Code Reference

Code
icd_ref = pd.DataFrame([
    {'ICD-10 Code': k, 'Description': v} for k, v in ICD10_CODES.items()
])

def get_organ_group(code):
    if code.startswith('C00'):
        return 'Lip'
    elif code == 'C01':
        return 'Base of tongue'
    elif code.startswith('C02'):
        return 'Other tongue'
    elif code.startswith('C03'):
        return 'Gum'
    elif code.startswith('C04'):
        return 'Floor of mouth'
    elif code.startswith('C05'):
        return 'Palate'
    elif code.startswith('C06'):
        return 'Other mouth'
    elif code == 'C07':
        return 'Parotid gland'
    elif code.startswith('C08'):
        return 'Other salivary'
    elif code.startswith('C09'):
        return 'Tonsil'
    elif code.startswith('C10'):
        return 'Oropharynx'
    elif code.startswith('C11'):
        return 'Nasopharynx'
    elif code == 'C12':
        return 'Piriform sinus'
    elif code.startswith('C13'):
        return 'Hypopharynx'
    elif code.startswith('C14'):
        return 'Other pharynx'
    elif code.startswith('C30'):
        return 'Nasal/middle ear'
    elif code.startswith('C31'):
        return 'Accessory sinuses'
    elif code.startswith('C32'):
        return 'Larynx'
    return 'Other'

icd_ref['Site'] = icd_ref['ICD-10 Code'].apply(get_organ_group)
icd_ref = icd_ref[['Site', 'ICD-10 Code', 'Description']]

# display() is required because this is not the last expression in the cell
display(icd_ref.style.hide(axis="index"))

# Auto-saved table (tbl-icd-reference)
_save_table(icd_ref, "tbl-icd-reference")
[saved table] tbl-icd-reference.csv
Table 13: ICD-10 codes for malignant neoplasms of head and neck (C00-C14, C30-C32)
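The chain of `elif` branches in `get_organ_group` maps each code to a site by its three-character category. An equivalent and more compact formulation is a dictionary lookup on the first three characters, sketched below as one possible alternative. The names `SITE_BY_CATEGORY` and `organ_group` are hypothetical. The sketch treats C01, C07 and C12 by prefix as well, which gives the same result for valid ICD-10 codes because those categories have no subdivisions.

```python
SITE_BY_CATEGORY = {
    'C00': 'Lip', 'C01': 'Base of tongue', 'C02': 'Other tongue',
    'C03': 'Gum', 'C04': 'Floor of mouth', 'C05': 'Palate',
    'C06': 'Other mouth', 'C07': 'Parotid gland', 'C08': 'Other salivary',
    'C09': 'Tonsil', 'C10': 'Oropharynx', 'C11': 'Nasopharynx',
    'C12': 'Piriform sinus', 'C13': 'Hypopharynx', 'C14': 'Other pharynx',
    'C30': 'Nasal/middle ear', 'C31': 'Accessory sinuses', 'C32': 'Larynx',
}


def organ_group(code):
    """Return the anatomical site for an ICD-10 code, or 'Other' if unmapped."""
    return SITE_BY_CATEGORY.get(str(code).strip().upper()[:3], 'Other')


examples = ['C32.1', 'C01', 'c09.9', 'C15']
mapped = {code: organ_group(code) for code in examples}
print(mapped)
```

Keeping the mapping in one dictionary also makes it easy to verify that all 18 categories are covered.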

References

1. Ahmad, O. B., Boschi-Pinto, C., Lopez, A. D., Murray, C. J. L., Lozano, R., & Inoue, M. (2001). Age standardization of rates: A new WHO standard (GPE Discussion Paper No. 31). World Health Organization.
2. Moran, P. A. P. (1950). Notes on continuous stochastic phenomena. Biometrika, 37(1/2), 17–23. https://doi.org/10.2307/2332142
3. Anselin, L. (1995). Local indicators of spatial association — LISA. Geographical Analysis, 27(2), 93–115. https://doi.org/10.1111/j.1538-4632.1995.tb00338.x