Data is currently at **https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts+dSST.csv**

or

**https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts+dSST.txt**

(or such updated location for this Gistemp v4 LOTI data)

January 2024 might show as 122, in hundredths of a degree C; this means +1.22C above the 1951-1980 base period. If it shows as 1.22 then it is already in degrees, i.e. 1.22C. The same interpretation will be applied to other values.

If the version or base period changes, then I will consult with traders over the best way for any such change to have the least effect on betting positions, or consider N/A if it is unclear what the sensible least-effect resolution should be.

Numbers are expected to be displayed to the hundredth of a degree. The extra digit used here is to ensure understanding that +1.20C does not resolve an "exceeds 1.205C" option as yes.
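The unit and resolution rules above can be sketched in a few lines. This is only an illustration of the stated interpretation; the function names and the magnitude heuristic are mine, not part of any official resolution spec.

```python
# Sketch of the interpretation rules above (names and the >= 10 heuristic
# are illustrative assumptions, not official resolution criteria).

def to_degrees(value):
    """Normalize a GISTEMP LOTI anomaly to degrees C.

    Values with magnitude >= 10 are assumed to be in hundredths of a
    degree (e.g. 122 -> 1.22); smaller values are assumed to already
    be in degrees (e.g. 1.22 stays 1.22).
    """
    return value / 100.0 if abs(value) >= 10 else value

def exceeds(anomaly_c, threshold_c):
    """Per the rule above, a displayed +1.20C does NOT resolve an
    'exceeds 1.205C' option as yes: strict comparison on the value."""
    return anomaly_c > threshold_c

print(to_degrees(122))        # 1.22
print(exceeds(1.20, 1.205))   # False
print(exceeds(1.21, 1.205))   # True
```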

## Related questions

@parhizj

August 2024 was the joint-warmest August globally (together with August 2023), with an average ERA5 surface air temperature of 16.82°C, 0.71°C above the 1991-2020 average for August.

If that is tied to nearest hundredth, why are you expecting LOTI to be 10 hundredths warmer?

(Assume I am being thick and not following what your graphs and explanations show.)

@ChristopherRandles Referring to the graphs below based on ERA5 data? The last prediction I made before the ERSST data came out was 1.294 C. Look at the error from August 2023 (orange dot) using that method (ERA5->GISTEMP): it was ~0.1C cooler than expected, but the range of values is also large (~0.1C). The error for the linear model I used for August (to try to correct any bias between ERA5 and GISTEMP) is also ~0.1C. I hope this explains why I expected it to be ~1.29; in other words, I interpret it as an unknown mix, in this model, of the GISTEMP value from last year underestimating or ERA5 overestimating the August 2023 value.

@ChristopherRandles That actually implies similar anomalies. My ERA5-based model median for GISTEMP was 127, based on a near tie in ERA5. Why? On the same base period, that is actually a 131 anomaly. We expect the differential between GISS and ERA5 (119 vs 131) to no longer be as large as it was last year (due to the El Nino).
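The "131 on the same base period" step is a base-period shift: an ERA5 anomaly quoted against 1991-2020 must be offset by the warming of the baseline itself before comparing with GISTEMP's 1951-1980 anomalies. A minimal sketch; the ~0.60 C August baseline offset below is inferred from the 0.71 -> 131 figures in this thread, not an official value.

```python
# Rebasing an anomaly between two base periods (illustrative sketch).

def rebase(anomaly, offset_new_minus_old):
    """Shift an anomaly from a newer base period to an older one.

    offset_new_minus_old = climatology(new base) - climatology(old base);
    anomalies against the newer (warmer) base are smaller by this amount,
    so we add it back to express the anomaly on the older base.
    """
    return anomaly + offset_new_minus_old

# ERA5 August 2024: 0.71 C above 1991-2020. Assuming the August baseline
# warmed ~0.60 C between 1951-1980 and 1991-2020, this is ~1.31 C on the
# 1951-1980 base, matching the "131 anomaly" quoted above.
print(round(rebase(0.71, 0.60), 2))
```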

In any case, we now have GHCN and ERSST data which settles it.

@aenews Or to put it another way, the differential between ERA5 and GISTEMP was anomalously high last year (owing to quirks of hot and cold patches instigated by El Nino). So they should be close to in line now that we are back to ENSO Neutral.

@aenews Yeah. Remember I posted the link to the repo in last month's market. It requires using an old version of their own ERSST for the correct masking by month and sub-box, though.

Polymarket has a 97% chance of record warmth:

https://polymarket.com/event/2024-august-hottest-on-record?tid=1725282541031

No ERSST data yet... seems to be pure modeling... I don't know how to reduce my error bars to make them tighter given my simple model...

```
GIS TEMP anomaly projection (August 2024) (corrected, assuming -0.014 error, (absolute_corrected_era5: 16.805)):
1.294 C +-0.098
```
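Treating the model output above (1.294 C with a ±0.098 one-sigma band) as a Gaussian, the implied probability of exceeding a record threshold can be computed directly. This is my framing of the comparison with Polymarket's 97%, not the commenter's actual method; the 1.19 C threshold is the August 2023 GISTEMP figure ("119") mentioned later in this thread.

```python
import math

def prob_exceed(mean, sigma, threshold):
    """P(X > threshold) for X ~ Normal(mean, sigma), via the
    complementary error function."""
    z = (threshold - mean) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

# Probability the prediction above beats the assumed 1.19 C August record.
p = prob_exceed(1.294, 0.098, 1.19)
print(round(p, 2))
```

Under these assumptions the implied probability comes out well below Polymarket's 97%, which is consistent with the commenter's skepticism.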

To be fair to the 97% on Poly, I do get a really high prediction... I'd have to reduce the prediction a bit more for it to not be within the 1-sigma margin of error I get...

@parhizj I think people might be creating "fake" ERSST & GHCNM files using other datasets. Then you can run the code as normal. That presumably gives you a decent bit more accuracy. Also, in my experience, the OISST dataset correlates better with ERSST than ERA5 does.

Changed my betting based on a metaprediction that the rest of the month's predicted temps will still be cooler than actual (i.e. meaning I still need to offset temps for the remaining days by at least +0.1C):

With it I get much less conservative (central) predictions, but the lower end isn't improved much...

A large amount of GEFS-ERA5 prediction error over the last two days, in the opposite direction of the metaprediction (about 0.1 degrees cooler than predicted), is causing me to pick a more neutral offset for the remaining days to update upon:

Only the highest bin suggests some additional correction might be needed...

Edit: Polymarket suggests ~50% or lower and that's higher than even my more conservative estimates.

For August I don't have a large ERA5-to-GISTEMP error correction in my model, as August is one of the months that is not fitted individually, unlike most of the other months. For the moment this seems to account for the largest amount of uncertainty (as opposed to the ERA5-GEFS prediction error).

Right now, with a neutral ERA5-GEFS offset (and an ERA5-GISTEMP correction of -0.014), I have a GISTEMP prediction of:

`1.282 C +-0.098`

On the other hand, the correction I used for July turned out to be too large (I had predicted ~1.18 if I recall and it came in at 1.21, so the correction of about -0.06 that I made was roughly twice as much as needed). Here I have a correction of only -0.014 for August; even if I guess a correction of -0.03, that would bring the 1.282 down to 1.266, which rounds to 1.27.

A plausible worst case, in my view, is if the remaining days' GEFS predictions are on average 0.1 warmer than ERA5 (requiring an offset of -0.1). This gives a corrected GISTEMP prediction of:

` 1.263 C +-0.098`

Subtracting a further 0.016 leads to 1.247 C, rounded to 1.25 C. This scenario seems more in line with Polymarket's current odds (adding up the >=1.25 bins gives 53%).
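The scenario arithmetic above (a base prediction, an extra ERA5->GISTEMP correction, and a worst-case GEFS offset) can be laid out explicitly. The numbers are taken from the comments; only the variable names are mine.

```python
# Reproducing the scenario arithmetic from the comments above.

base = 1.282              # neutral GEFS offset, -0.014 ERA5->GISTEMP correction
extra_correction = 0.016  # moving the correction from -0.014 to -0.03

# Larger-correction scenario: 1.282 - 0.016 = 1.266, which rounds to 1.27.
print(round(base - extra_correction, 2))   # 1.27

# Worst case: remaining days' GEFS runs 0.1 C warmer than ERA5 gives 1.263;
# with the larger correction this is 1.247, which rounds to 1.25.
worst_case = 1.263
print(round(worst_case - extra_correction, 3))   # 1.247
```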

The month is still not half over, but here is what I have so far for GISTEMP breaking the August record:

Keeping in mind the GEFS prediction error has been mostly positive as of late (which is why the offsets are mostly > 0):

The second half of July temps really upset the predictions, as before that the mean/median was fairly close to 0 (the negative prediction error corresponds to the predictions (all past GEFS predictions for that day) that were cooler than what ERA5 was for that day):

(For the past 15 days (not shown) -- that is, the last few days of the month with no data -- I use some guesswork predictions from the prophet library to fill in the data.)

For fun, here is the NOAA chart for August breaking its own monthly record (1.09 C) (it was weird that NOAA's anomaly for August 2023 was much lower than GISTEMP's):

With a day left, the probabilities for me have converged to a middle of about **~65%** after a large drop in the last week. The metaprediction I made 10 days ago seems ill-advised in hindsight, as I had ~60% back then (the GEFS-ERA5 error didn't continue for long and actually flipped the other way in the last few days).

Adding up the bins from Polymarket puts >= 1.25 C at **70%** (this corresponds to a bin of >=1.245 that doesn't exist on this question, but it's within range of this prediction).

After giving up on trying to recreate the SBBX from the nc file, I finally ran GISTEMP for the first time with all its own data (after fixing a bug in the code for empty station records, which only excluded two land stations), but I get very different numbers for the months... I only see one warning:

```
.local/lib/python3.9/site-packages/numpy/lib/npyio.py:716: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  val = np.asanyarray(val)
```
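That warning comes from building an array out of ragged nested sequences (e.g. station records of different lengths); on NumPy >= 1.24 the same code raises an error instead of warning. A minimal reproduction with the fix the warning asks for (the example data is illustrative, not from GISTEMP):

```python
import numpy as np

# Rows of different lengths, e.g. station records of unequal span.
ragged = [[1, 2, 3], [4, 5]]

# np.asanyarray(ragged) warns on older NumPy and errors on >= 1.24.
# The fix the warning asks for is an explicit object dtype, which
# produces a 1-D array whose elements are the original lists:
arr = np.asanyarray(ragged, dtype=object)
print(arr.dtype, len(arr))   # object 2
```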

Has anyone else successfully run it to get the same numbers? I know @aenews has, but has anyone else needed to modify the python code?

Yeah, deprecated code. 3.9 may work fine, but I'd recommend using Python 3.4 - Python 3.8 and preferably the oldest version of NumPy available if using anything 3.5 and newer.

You need to make some minor adjustments to get the same numbers.

And regarding the ERSST, it's pretty tricky. I don't think what I have rn is that great. Could be better.

@majorj Thanks! That makes a lot of sense. Going to run it on old Python (3.4.5) and on the modified version I've edited to run on Python 3.12, to see what I get now...

@majorj Ok, thanks for the help; I don't think I would have figured out this one point without it. I assumed the data sources txt was out of date, rather than that the code needed to be modified to exclude those stations.

I've put the modified GISTEMP code (updated to run on the new Python/NumPy) here, since it's so much faster to run with pypy3: https://github.com/JRPdata/gistemp4.0/

I've only briefly checked global LOTI in the new modified version against the GISS reference .csv for global monthly means, and (barring a single month's error of 0.01 C) it is very close. I also tried running the original unmodified code in an older Python environment (Python 3.4.5 and NumPy 1.8.2; the NumPy conda package is hard to find but can be retrieved with `conda install -c pydy --no-pin numpy=1.8.2`).

I haven't checked any of the other files so I can't say if it has any other problems as I'm only interested in global LOTI.

As for generating the ERSSTv5 SBBX, I've posted comparisons of the different area methods for generating it with the updated code for the new Python 3. Generating the ERSSTv5 SBBX from the public data has **many** 'small' errors: about 100 months have between 0.01C and 0.02C of error in the global monthly means. Comparisons of the different methods are here:

https://github.com/JRPdata/gistemp4.0/tree/main/ersst_calc_comparisons

They don't seem to use a sea ice cutoff as the step4 code might suggest. The closest values I get are with weighted averaging using spherical-geometry area calculations for the regridding of the subgrid boxes (which is suggested by some of the interpolating Fortran code). Projected-area calculations are comparable, though not better.
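The spherical-geometry weighting mentioned above amounts to weighting each lat-lon cell by its exact area on the sphere, A = R² Δλ (sin φ₂ − sin φ₁). A self-contained sketch of that weighting (my illustration, not the repo's actual code):

```python
import math

def cell_area(lat1, lat2, lon1, lon2, radius=6371.0):
    """Exact area (km^2) of a lat-lon box on a sphere:
    A = R^2 * (lon2 - lon1) * (sin(lat2) - sin(lat1)), angles in degrees."""
    dlon = math.radians(lon2 - lon1)
    return radius**2 * dlon * (
        math.sin(math.radians(lat2)) - math.sin(math.radians(lat1))
    )

def weighted_mean(values, areas):
    """Area-weighted average of subgrid-box values, as used when
    regridding subgrid boxes into larger boxes."""
    return sum(v * a for v, a in zip(values, areas)) / sum(areas)

# Sanity check: summing one band over the whole sphere gives 4*pi*R^2.
total = cell_area(-90, 90, 0, 360)
print(math.isclose(total, 4 * math.pi * 6371.0**2))   # True
```

Because cell area falls off with the sine of latitude, plain (unweighted) averaging over-counts high-latitude boxes, which is one plausible source of the 0.01-0.02C discrepancies described above.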

The repo also has some Python utilities some of you might find useful, besides generating an approximate ERSST SBBX from the PSL netCDF4: SBBX_to_txt.py, which is a Python translation of the Fortran code; some inspection tools relating to SBBXs (including graphing the SBBX txt boxes on a map); comparison tools for comparing your csv against a reference (the histogram graphics); and other random inspection tools.

@aenews I've been trying to produce the ERSSTv5 data, and seemingly the only problem I have is the calculation of the baseline.

It is supposed to be 1951-1980 according to the doc, and I believe I am calculating the temperatures correctly, but the anomaly temperatures I get differ by a fixed constant for each (subgrid box, month) pair across the years, suggesting I don't know how they actually calculate the baseline.

Edit: I've already inspected the absolute-temperature and anomaly-temperature dataframes manually to calculate the average for a subgrid box interactively, but everything seems correct on my end...
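For reference, the simplest reading of a 1951-1980 baseline is a per-(month, box) mean over those 30 years, with anomalies as deviations from it; a constant per-(box, month) offset from the reference data would then mean GISS computes the climatology some other way. A sketch of that simple reading on synthetic data (illustrative only; not GISS's actual procedure):

```python
import numpy as np

# Synthetic absolute temperatures indexed [year, month, box];
# rows 0..29 stand in for 1951..1980.
rng = np.random.default_rng(0)
temps = 15 + rng.normal(0, 1, size=(74, 12, 4))  # 1951..2024, 12 months, 4 boxes

# Per-(month, box) 1951-1980 climatology, shape (12, 4).
baseline = temps[0:30].mean(axis=0)

# Anomalies: broadcast the climatology over all years.
anomalies = temps - baseline

# By construction the 1951-1980 anomalies average to ~0 per (month, box);
# a nonzero constant here would indicate a different baseline definition.
print(np.abs(anomalies[0:30].mean(axis=0)).max() < 1e-12)   # True
```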