> For the complete documentation index, see [llms.txt](https://jonas.gitbook.io/seismo_rain/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://jonas.gitbook.io/seismo_rain/cross-comparison-between-seismic-and-rainfall-data/establishing-a-correlation-between-displacement-amplitude-and-rainfall-intensity.md).

# Establishing a correlation between displacement amplitude and rainfall intensity

Having extracted the processed seismic data from all four stations and obtained the rainfall intensity at each seismic station from CMORPH. We aim to explore the correlation between the maximum absolute amplitude of displacement at each 30-minute interval and the rainfall intensity. Previous findings have indicated a robust relationship between Seismic Power (P) (bandpass filtered between 80-120 Hz) and rainfall intensity (I) with a relationship expressed as P ∝ $$I^{1.9}$$ ([Bakker et al., 2022](https://doi.org/10.1016/j.jhydrol.2022.127812)).

## Calculate the maximum absolute displacement from the waveforms

```python
# Define the length of the data
data_length = len(st[0])

# Define the number of segments (in 30-minute intervals)
num_segments = 8

# Calculate the size of each segment
segment_size = data_length // num_segments

# Create lists to store maximum values arrays for each seismic station
max_values_arrays = []

# Loop over each seismic station
for station_data in st:
    # Create an array to store maximum values for the current seismic station
    max_values = np.zeros(num_segments)
    
    # Loop through the segments
    for i in range(num_segments):
        # Calculate start and end indices for the current segment
        x = i * segment_size
        y = (i + 1) * segment_size
        
        # Calculate the maximum value for the current segment
        max_values[i] = np.max(np.abs(station_data[x:y]))
    
    # Append the array to the list of maximum values arrays
    max_values_arrays.append(max_values)

# Convert the list of arrays to a 1D array
max_values_arrays = np.array(max_values_arrays).flatten()

# Print the maximum values array
print(max_values_arrays)
```

The resulting data maintains the original order of seismic stations and timestamps, corresponding to the timing of the rainfall data.

```
[1.28638882e-07 1.73963539e-07 3.41861054e-07 1.39142704e-07
 6.71864096e-08 4.59934489e-08 5.43080290e-08 6.07141831e-08
 1.69014273e-08 5.40966746e-08 1.14213725e-07 1.16206957e-07
 4.59686219e-08 2.46533107e-08 7.87711692e-09 1.23106982e-08
 1.39437923e-07 4.85595590e-08 1.03382752e-07 2.64644315e-07
 2.68364114e-07 2.16924390e-07 2.78347212e-08 7.48340521e-09
 4.01597823e-08 9.63271772e-08 7.42827166e-08 1.44401169e-07
 1.56135115e-07 6.37538969e-08 2.96375077e-08 3.29787613e-09]
```

## Making a 1D array for rainfall intensity data

Now, having already processed the [rainfall data](/seismo_rain/rainfall-data/extracting-rainfall-data.md#opening-precipitation-netcdf-files) for the seismic station's location, we can stack them together to create a 1D array.

```python
prep = np.array([prep_1517,prep_199,prep_465,prep_763]).flatten()
print(prep)
```

Output:

```
array([1.22      , 1.5899999 , 1.4599999 , 2.2       , 1.5899999 ,
       0.        , 0.24      , 0.84999996, 1.1       , 0.72999996,
       1.4599999 , 2.07      , 2.2       , 0.        , 0.24      ,
       0.84999996, 0.        , 0.61      , 1.4599999 , 1.8199999 ,
       3.3999999 , 2.4299998 , 0.24      , 0.48999998, 0.        ,
       0.61      , 0.24      , 1.5799999 , 2.4299998 , 2.4299998 ,
       1.4599999 , 0.48999998], dtype=float32)
```

## Correlate between maximum absolute displacement and rainfall intensity

We'll proceed by creating a scatter plot that compares the maximum absolute displacement with the precipitation data. Then, we'll conduct statistical analysis to determine the correlation coefficient ($$r^{2}$$) and p-value between these variables.

```python
from scipy.stats import linregress

# Perform linear regression
slope, intercept, r_value, p_value, std_err = linregress(prep, max_values_arrays)

# Calculate R-squared
r_squared = r_value**2

print("R-squared:", r_squared)
print("p-value:", p_value)

# Plot the data and linear regression line
plt.scatter(prep, max_values_arrays, label='Data')
plt.plot(prep, intercept + slope*prep, color='red', label='Linear Regression')
plt.xlabel('Rainfall Intensity (mm/hour)')
plt.ylabel('Maximum Absolute Displacement (m)')

plt.legend()
plt.grid(True)
plt.show()
```

Output:

```
R-squared: 0.3419
p-value: 0.0004
```

<figure><img src="/files/yIZrllbiBvQUtfagM6SL" alt=""><figcaption><p><strong>The correlation between maximum absolute displacement and rainfall intensity</strong></p></figcaption></figure>

The correlation between maximum absolute displacement and rainfall intensity appears weak due to a large variance. However, a statistically significant positive trend (p < 0.05) is observed. This suggests a direct proportionality between rainfall intensity, raindrop size, the kinetic energy of raindrops hitting the ground, and maximum displacement ([Rosewell, 1986](https://journals.ametsoc.org/view/journals/apme/25/11/1520-0450_1986_025_1695_rkeiea_2_0_co_2.xml); [Bakker et al., 2022](https://doi.org/10.1016/j.jhydrol.2022.127812)). Yet, establishing a robust mathematical relationship requires a more extensive dataset.
