336 Views NWS Data Outage [resolved]

Link: https://mesonet.agron.iastate.edu

The National Weather Service is experiencing an ongoing data flow outage impacting things like IEM NEXRAD composites, etc. No ETR at the moment and will update the news item as things progress.

Updated 8:40 PM: The outage lasted between about 3 PM and 5:15 PM CST. Depressingly, it appears all data was lost for that period. Will see what can be done though.

Final Update: Sadly, the data from this outage is generally unrecoverable. Sigh.

357 Views ISU DNS Outage [resolved]

Link: https://status.it.iastate.edu/incidents/335386

Iowa State University had a Domain Name Service (DNS) outage lasting between about 5 and 6 PM on 10 January 2022. This outage broke a number of data flows processed by the IEM, but services to external users should have still been available. Not much can be done in these situations other than thank you for your patience!

354 Views NWS Model Data Issue

Link: https://mesonet.agron.iastate.edu

Since yesterday (5 January 2022), the NWS has been experiencing a significant outage of its College Park, MD datacenter. This outage is causing various NCEP model datasets and others to not be available for processing. It is difficult to quantify the troubles this is causing IEM processing, but please be aware if you see various products in an error state :( The current NWS ETR is tomorrow (7 January 2022) morning.

Updated 6 Jan 2 PM: The NOMADS NCEP site is working and I have moved some processing over to it, so more data is flowing now than before. The NWS has updated that the College Park outage is now hoped to be fixed by Saturday morning.

Updated 10 Jan 9 AM: Everything should be back to normal now.

393 Views 23 Sept Outage [resolved]

Link: https://mesonet.agron.iastate.edu

There was a partial outage of the IEM website between about 7:30 and 8:15 PM on 23 September 2021. A likely innocent web API user sent a few thousand simultaneous requests at a brittle website component that ended up exhausting resources for the Fast-CGI PHP server. I have added some additional code to help monitor this and keep the brittle resource from being overwhelmed. Sorry about the outage and thanks for the patience!
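To illustrate one way of keeping a brittle component from being overwhelmed, here is a minimal Python sketch that caps the number of in-flight requests and rejects the overflow instead of queueing it. This is not the code that was added to the IEM; the class name `ConcurrencyGate`, the `try_run` method and the limit value are hypothetical.

```python
import threading


class ConcurrencyGate:
    """Allow at most `limit` callers inside the guarded section at once."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.rejected = 0  # simple counter that a monitor could poll

    def try_run(self, func, *args):
        """Return (True, result) if a slot was free, else (False, None)."""
        # Do not wait for a slot: rejecting right away keeps a burst of
        # requests from piling up behind the protected resource.
        if not self._slots.acquire(blocking=False):
            with self._lock:
                self.rejected += 1
            return (False, None)
        try:
            return (True, func(*args))
        finally:
            self._slots.release()
```

A caller would wrap the expensive handler in `gate.try_run(handler, request)` and answer with a "try again later" response whenever the first element comes back `False`.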

774 Views 25 May Maintenance Outages

Link: https://mesonet.agron.iastate.edu

There are two maintenance outages scheduled for 24 May 2021 as summer time means disruptive changes are done!

ISU Network Engineers plan to do network updates between 5-7 AM with possible network outages during this time. The primary impact for the IEM would be you can't reach the website and real-time data ingest may be delayed or fail.

The central storage mtarchive uses will be down for most of the day for yearly updates and patching. I have some workarounds in place to prevent data loss, but various services that use that archive will be failing. This is not too impactful for the IEM nor its services.

I'll update this news item on Tuesday as these windows are cleared and services get restored. Thanks for your patience.

Updated 9:30 AM: The network outage lasted between 5:30 and 6:10 AM this morning with no known impacts other than some lost data from live platforms, like web cams. The mtarchive downtime is now ongoing and will update this news item later today. Thanks for your patience.

Updated 11 AM: The mtarchive outage is over, but a filesystem scrub operation is ongoing and typically takes a few days to complete. The operation causes some performance issues in the interim.

623 Views 30 March NWS Data Outage [resolved]

There was a significant National Weather Service network outage last night that impacted a number of data flows into the IEM. I am still attempting to assess the extent of the various data holes, but am on the case! Will update this news item once more is known.

Updated 10 AM: Not much for details has been released for this outage. There are various data holes that are likely not repairable as there is no upstream source. Oh well.

433 Views 20 Mar NWS Data Outage [resolved]

Link: https://mesonet.agron.iastate.edu/wx/afos/p.php?pil=ADASDM&e=202103200822

There was an outage of the NWS NOAAPort broadcast (which the IEM uses as the data source for many NWS products) between about 2 and 4 AM this morning (20 March). Unfortunately, the NWS indicates that all data during this period was lost, so alas. I will try some tricks to fill in various holes left, but some of them are not fixable. Sorry for the outage.
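Finding the holes left by an outage like this comes down to looking for gaps between consecutive timestamps. The sketch below is only an illustration of that idea, not the IEM's actual repair tooling; the function name `find_holes` and the five minute expected interval are assumptions.

```python
from datetime import datetime, timedelta


def find_holes(timestamps, expected=timedelta(minutes=5)):
    """Return (start, end) pairs where neighbors are further apart than `expected`."""
    ordered = sorted(timestamps)
    # Pair each timestamp with the one that follows it
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > expected]


# Made-up product times with nothing received between 2 and 4 AM
times = [
    datetime(2021, 3, 20, 1, 55),
    datetime(2021, 3, 20, 2, 0),
    datetime(2021, 3, 20, 4, 0),
    datetime(2021, 3, 20, 4, 5),
]
holes = find_holes(times)
```

Here `holes` contains the single gap from 2:00 to 4:00 AM.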

777 Views ASOS 1 Minute Data Outage [resolved]

The IEM processes one minute interval data from ASOS sites via two sources. The near real-time source is the MADIS One Minute ASOS feed which provides data at five minute intervals (confusing eh?). The slightly delayed feed comes from NCEI. Both feeds have been somewhat struggling over the past month, but the MADIS one has been mostly down.

A contact I have at MADIS said that the issue is upstream of them and they continue to actively request a resolution. I'll update this news item once I hear more information about this data feed.

Update 10 Dec 2020: The MADIS data flow appears to be more stable now, but the folks at NCEP IDP are unsure if the upstream issue has been fully resolved or not. They continue to monitor.

Update 14 Dec 2020: The MADIS data flow appears to have stopped again, no updates as to what is happening.

Update 29 Jan 2021: The NCEP IDP folks believe this issue to be resolved.

796 Views 10 August Derecho Outage

Link: https://mesonet.agron.iastate.edu

Nothing short of a crazy day today. I'll update this news item frequently as I make progress restoring the IEM and filling the data void left today due to the Ames/ISU power outage from the Derecho of 2020!

11:21 PM: Power is back and services are getting back on their feet. Lots of issues yet and data holes to fill.

Updated 11 Aug 12:04 AM: The IEM should be functioning again for realtime requests. The AFOS (text product) database hole should now be filled. This is the first step in repairing other datasets of this genre. The extent of the outage was from about 11 AM this morning until 10 PM this evening.

Updated 11 Aug 12:22 AM: Some ISU ITS managed virtual machines remain offline and there are cooling capacity issues in the central data center. This marginally impacts the IEM, but some archiving is not working at the moment due to it.

Updated 5:30 AM: No power at my house is making for a slow go with getting more of the data holes repaired. Will do what I can and keep this news item updated.

Updated 10:50 AM: Maybe I now have the hole in the VTEC (NWS Watch Warning Advisory) database and Local Storm Report (LSR) database filled. It is difficult to tell with my poor network working conditions at the moment.

Updated 8:45 PM: The logistical nightmare continues and repairs to the holes are slow going. Perhaps I have the RIDGE imagery backfilled now and will soon have the NEXRAD composites repaired. It is hopeless for me at the moment to catch up on emails and twitter mentions. Thanks for your patience.

Updated 12 August 4:30 AM: I believe the hole with the ASOS/METAR data should now be filled. I am getting questions about peak gusts in Iowa for the event and I can not really answer them until I get these holes repaired. Filling the RWIS one may not be possible as I am unsure if the DOT, who is located in Ames, was able to collect data during the Ames power outage. I will reach out to them soon.

Updated 8:05 AM: Repaired NEXRAD storm attribute archive hole.

Updated 10:35 PM: Not much for updates to share. I continue to struggle with logistics due to no power and Internet at home. Thankfully, power and cooling at work have been stable.

Update 13 Aug 12:00 PM: I have power and internet at home now, so can get after this task some more!

Update 13 Aug 4:00 PM: A previous backfill of VTEC / Storm Based Warning products did not go as planned. This was cleaned up and should be all square now.

Update 16 August 2:20 PM: I have now repaired the hole with the SPC Day 1 outlooks and MCDs.

626 Views 8 April Partial Outage [resolved]

One of the servers in the IEM computing cluster suffered a fatal power supply failure on the evening of 7 April 2020. Sadly, it happened just after I went to bed and I slept through the alarms until I awoke at 2:30 AM (coronavirus quarantine and working from home have my sleep schedule all wonky). I trucked it into work and got it repaired shortly after 4 AM. I am still assessing the data holes that may be present from this outage and will update the news item with what I figure out. Sorry for the troubles!

Update 10 AM: The missed RIDGE imagery has been generally fixed with the most commonly used products processed.

952 Views 2-3 Jan Website Troubles [resolved]

Link: https://mesonet.agron.iastate.edu

The IEM website has suffered a number of puzzling outages over the past two days. I think I have finally isolated the problem to an Apache web server configuration issue. I recently migrated the web farm to Red Hat Enterprise Linux 8 and failed to apply the "event" MPM configuration vs the default of "prefork". Hopefully the bumpiness is behind us now and things will be back to performant normal!

Thanks for your patience!

585 Views 16 Dec Website Degradation [resolved]

Link: https://mesonet.agron.iastate.edu

An important disk unexpectedly ran out of inodes and caused a strange cascading failure that knocked out the IEM website between about 5 AM and 8:30 AM on 16 December 2019. I thought I had things under control shortly after the issue started, but the cascading failure caused the NAT gateway to not properly operate and things went downhill for a Monday morning after that. Will boggle this all and try to figure out how to keep it from happening again! Thanks for your patience.
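Running out of inodes is easy to miss because the disk can still show plenty of free space. As one possible way to watch for it, the following Python sketch reads the inode counts with `os.statvfs` (Unix only). It is an illustration rather than the monitoring actually used here; the function names and the 0.9 threshold are made up.

```python
import os


def inode_usage(path):
    """Return the fraction of inodes in use on the filesystem holding `path`."""
    st = os.statvfs(path)
    if st.f_files == 0:
        # Some filesystems do not report an inode total
        return 0.0
    return (st.f_files - st.f_ffree) / st.f_files


def inode_alert(path, threshold=0.9):
    """True when inode usage is at or above `threshold` (0.9 is arbitrary)."""
    return inode_usage(path) >= threshold
```

Calling `inode_alert("/")` from a periodic job would flag the condition before file creation starts failing.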

956 Views IEM Service Degradation [resolved]

Link: https://mesonet.agron.iastate.edu

One of the back end file servers failed horribly just after 7:30 PM this evening (1 December 2019) taking with it a fair amount of data and processing services. Am attempting to get bandages put on the various data flows before assessing if there is any hope in the morning for fixing it. Will update this news item tomorrow. Thanks for your patience.

Update 8:30 AM 2 Dec 2019: Thanks to some help from a colleague onsite last night, we were able to cold restart a file server and get it back online in a degraded state. This morning, I replaced the failed hard drive and the system is back rebuilding redundancy. Everything should be functioning normally, but if you see trouble, please let me know!

965 Views 13 Nov Website Troubles [resolved]

Link: https://mesonet.agron.iastate.edu

An important backend network file service jammed up early this morning and led to various cascade failures of IEM services. We should be back to normal now and will be reviewing any data holes for repair. Sorry for the troubles and thanks for the patience.

967 Views Database Service Upgrade

Link: https://mesonet.agron.iastate.edu

The backend database server for the IEM was updated yesterday to version 12.0 and PostGIS 3.0beta4. While the upgrade went smoothly, there are some performance issues at the moment. I'll update this news item once the issues have been resolved. Thanks for your patience.

Update 11 AM 13 Oct 2019: Had some trouble this morning with the secondary database being overwhelmed with connections. I am still reorganizing the backend databases to work around some troubles found with the recent upgrade. The fun never stops with this and I am thankful for your patience.

Update 9 PM 15 Oct 2019: Things have generally been stable, but am still not running the configuration I would like. Am waiting for some new upstream packages to be made available and will see if the performance issues go away with them.

789 Views 27 Sep Website Outage [resolved]

Link: https://mesonet.agron.iastate.edu

The IEM website was degraded from about 3 AM till 8 AM this morning due to an ugly cascading failure. The current theory is that a Red Hat Enterprise Linux 7 bug caused an NFS client with a failed memory DIMM to lock up an important NFS server. This server then jammed until the client was physically restored by me. The fun never stops. Hopefully this won't happen again! Thanks for your patience.

793 Views Sep 26 Planned Network Outage

Link: https://mesonet.agron.iastate.edu

ISU Networks, Operations, and Communications will be doing building network switch replacement between 3 and 7 AM on 26 September 2019. A number of network outages lasting up to an hour are expected during this window. I'll update this news item once the work is completed. Thanks for your patience.

Update 26 Sep, 7:11 AM: We are getting back to normal and will be attempting to repair various data holes left while the network was out. Please let me know of any issues you are seeing.

1169 Views 20 July Power Outage

Link: https://mesonet.agron.iastate.edu/

At about 12:30 PM CDT on 20 July 2019, much of Ames and Iowa State University lost power. This took out all IEM computing resources :) At about 2:03 PM, power was restored and about 2:30 PM I was able to get all IEM computing back on its feet. There are still some sick data flows and services and will update this news item as things are repaired. Thanks for your patience.

Update 8:30 PM: After great gnashing of teeth, everything should be back to normal now. A data hole still exists that I will repair over the coming days. Thanks for your patience.

Update 22 July, 8:00 AM: Issues continue to be found and fixed, but please let me know if there is something still broken. At this point, figuring out what is broken is the toughest portion of the battle.

1051 Views Internet Outage Jul 3 [resolved]

There will be a brief outage of Internet for the IEM starting at about 6 AM on 3 July 2019. The outage will hopefully resolve some bandwidth issues that have been plaguing the web farm and IEM services for a number of months now. Should be back up within 20 minutes. Will update the news item once completed. Thanks for your patience.

Update 6:21 AM: The outage was from about 5:57 AM till 6:08 AM. Sadly, the underlying issue was not resolved with this outage.

1352 Views IEM Outage [resolved]

Link: https://mesonet.agron.iastate.edu/

I just woke up and am trying to collect information, but ISU suffered some sort of power dump overnight and I have a mess to clean up. Will post updates here as I get things repaired!

Update 9:30 AM: The servers are all back on their feet, but I have a small data hole to plug with the HADS and METAR data. Will update this news item again once that is repaired.

Update 3 PM: The SHEF and METAR holes should be repaired and no known issues remaining.

866 Views Iowa RWIS Outage [resolved]

Link: https://mesonet.agron.iastate.edu/RWIS/

The Iowa DOT provided RWIS data has not updated since Sunday (8 Apr 2019) morning. There is some issue with the data flow on their end that is being actively worked on. Will update this news item once the feed has been resolved.

Update 9 PM: Data flow was fixed about 3 PM this afternoon.

1384 Views mesonet-nexrad service outage/change

Link: https://mesonet-nexrad.agron.iastate.edu

The Mesonet Level II service provides open access to the National Weather Service "Level II" NEXRAD data. This service is being moved on 8 January 2018 to a new virtual home in the data center.

Firstly, there will be an outage of some duration starting around 8 AM 8 Jan 2018 as a physical move happens and then network reconfiguration happens. Hopefully this outage can be limited in time, I will update this news item once done.

Secondly, the service will be available from a different IPv4 and IPv6 address. The domain name will not change, but perhaps some folks have IP based firewalls in place. The new IP addresses will be 129.186.90.3 and 2610:130:108:480::3.
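For those with IP based firewalls, a quick check of whether an existing allow-list already covers the new addresses can be done with Python's standard `ipaddress` module. This is only a sketch; the function name and the example allow-list entries are hypothetical, and only the two addresses come from this news item.

```python
import ipaddress

# The two new addresses announced above
NEW_ADDRESSES = ["129.186.90.3", "2610:130:108:480::3"]


def missing_from_allowlist(addresses, allowed_networks):
    """Return the addresses not covered by any network in the allow-list."""
    nets = [ipaddress.ip_network(n) for n in allowed_networks]
    missing = []
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        # Only compare within the same IP version (v4 vs v6)
        if not any(ip.version == net.version and ip in net for net in nets):
            missing.append(addr)
    return missing


# Hypothetical allow-list that still holds only an old IPv4 range
print(missing_from_allowlist(NEW_ADDRESSES, ["192.0.2.0/24"]))
```

With that placeholder allow-list both addresses are reported as missing, so both an IPv4 and an IPv6 rule would need to be added.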

This service will also default to HTTPS going forward.

Hopefully all of these changes are transparent to users. Famous last words! :)

Updated 8 Jan, 10:45 AM: Some network complications were found prior to the move starting, so the old service remains running for now until attempt number two is hopefully made later today. Thanks for your patience.

Updated 8 Jan, 2:45 PM: I believe we are up and stable with the new setup. If you are having trouble accessing, please let me know!

1087 Views 23 Nov: NWS Data Outage [resolved]

Link: https://mesonet.agron.iastate.edu

There is an ongoing outage of data from the National Weather Service. Will update this news item once it is resolved. No word from the NWS what the issue is or when it will be fixed.

Update 2:20 PM: The NWS reports the issue to be resolved, but there was a significant hole that does not appear possible to backfill.

962 Views 17 July NWS Data Outage [resolved]

Link: https://mesonet.agron.iastate.edu/

The data flow from the NWS stopped at about 11:30 AM 17 July 2017. No ETA on a fix. This impacts lots of IEM services. Will update the news item once it is repaired.

Resolved as of 12:07 PM.

1613 Views 13 Feb NWS Data Outage [Resolved]

Link: https://www.washingtonpost.com/news/capital-weather-gang/wp/2017/02/13/weather-service-suffers-catastrophic-outage-website-stops-sending-forecasts-warnings

The flow of National Weather Service data has been down for the past few hours. This is a nationwide outage of their satellite system, so there are lots of folks in this same boat. Will update this message once it is resolved upstream.

Update 3:30 PM: Data is flowing again and the extent of the outage was from approximately 12 to 3 PM. I suspect some data is lost forever.

950 Views 13 Dec METAR Data Outage

There has been a fire at some location important to the relay of METARs to the world. This relay is down until further notice, so it appears many METAR / Airport / ASOS+AWOS sites will be unavailable in the interim! Will update this NEWS item with any further details I get.

Here's a plot of my monitoring showing the downturn in available sites:

Updated 9 PM: Most sites are back now, but am unsure of when the full restoration will happen. There was a problem with a telco location in Omaha that caused this outage.

779 Views ISU Internet Outage [resolved]

Link: https://www.it.iastate.edu/cel/view/2242

Iowa State University lost Internet access Sunday, 20 Nov 2016, between 6:10 and 8:21 PM CST. An outage of this duration does cause data loss for the IEM project, but I will make an attempt to repair some of the holes caused. No word on why the border routers failed.

1436 Views Sept 25th Internet Outages

Link: http://www.inside.iastate.edu/article/2016/09/15/upgrades

ISU Network folks will be doing maintenance on Sunday, 25 September 2016. They expect a number of outages during the day as they replace routers, etc. So availability to the IEM will be up and down during the day with actual outages not expected to last for too long.

Update 10 PM 26 September: So firstly, an apology for the prolonged issues that occurred. It did not help the situation that I was traveling and could not connect myself to the IEM servers during the outage. Full network service was restored around noon today. On Sunday, the 1 hour initial outage in the morning was known and expected. What happened after about 11 AM was not. There was some router config issue that was preventing returning traffic from the IEM from reaching the clients during this time. When these types of issues happen, I post updates to my @akrherz Twitter account.

1481 Views Network Outage [resolved]

Link: https://www.it.iastate.edu/cel/view/2141

There was an ISU Internet outage this morning that prevented most folks from accessing the IEM website between 5:00 and 5:27 AM this morning (5 April 2016).

1383 Views Network Outage [resolved]

Link: https://www.it.iastate.edu/cel/view/2115

The IEM was unavailable this morning (3 March 2016) between 4:30 and 5:03 AM due to scheduled network maintenance.

2038 Views 17 Jun METAR Outage [resolved]

Link: http://mesonet.agron.iastate.edu/ASOS/current.phtml

There was a rather large METAR data outage last night due to issues at the Federal Aviation Administration (FAA). The FAA collects the airport weather station data (in METAR format) and disseminates it to the National Weather Service, from whom the IEM collects the data. It is doubtful that I'll be able to repair the hole in the archive from this outage.

The outage lasted from just before midnight this morning to about 5:30 AM. Here's a plot from my monitoring showing global METAR station counts with the number of stations reporting within the past hour plotted.
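The quantity described above, the number of stations reporting within the past hour, can be computed along these lines. This is a minimal sketch and not the actual monitoring code; the function name, the station identifiers and the timestamps are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def stations_reporting(latest_obs, now, window=timedelta(hours=1)):
    """Count stations whose latest observation falls within `window` of `now`."""
    cutoff = now - window
    return sum(1 for ts in latest_obs.values() if cutoff <= ts <= now)


# Made-up latest observation times keyed by a placeholder station ID
now = datetime(2015, 6, 17, 6, 0, tzinfo=timezone.utc)
latest = {
    "AAA": now - timedelta(minutes=7),
    "BBB": now - timedelta(minutes=55),
    "CCC": now - timedelta(hours=5),  # silent since the outage began
}
count = stations_reporting(latest, now)
```

Here `count` is 2. Sampling this value on a schedule and plotting it over time shows an outage as a sharp drop.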

2627 Views 27 April Outage [resolved]

Link: http://mesonet.agron.iastate.edu/

The IEM website was very slow or unavailable for a period between about 3:20 and 4:30 PM on Monday, 27 April 2015. This was due to a cascade failure as a backup database server flooded a file server with IO requests and that slowed down another process that reads data from that server. Oye. The primary database server is about an order of magnitude faster than the backup server, so the write loads the primary server generates sometimes slow down the backup server.

I am moving the backup database instance to a different disk system to prevent this from happening in the future. Thanks for your patience.

2779 Views 16 April Outage [resolved]

Link: http://mesonet.agron.iastate.edu/

The IEM website was very slow or unresponsive between 1-2 PM on Thursday, 16 April 2015. I was sitting in a meeting at the time and did not immediately notice that bad things were happening. The issue was a cascading failure of my tilecache service due to a failure to generate N0R RADAR Composites during this time. The lack of current radar data was causing most of the incoming requests to bypass a caching layer and instead hit the mapserver backend, which was quickly overwhelmed with work.

I have made some changes to hopefully prevent this from happening again, thanks for your patience.
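One general pattern for this kind of failure is a cache that keeps serving the last good result when a refresh fails, so that a struggling backend is not hit by every request. The Python sketch below illustrates that pattern only. It is not the change made to the tilecache service, and the class name `StaleOkCache` and its parameters are hypothetical.

```python
import time


class StaleOkCache:
    """Cache that serves the last good value when a refresh fails."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self._fetch = fetch   # callable(key) -> value, may raise
        self._ttl = ttl       # seconds before a refresh is attempted
        self._clock = clock   # injectable so the timing can be tested
        self._store = {}      # key -> (value, time of last attempt)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]
        try:
            value = self._fetch(key)
        except Exception:
            if entry is None:
                raise  # nothing cached yet, so the error must surface
            # Backend is struggling: reuse the old value and restart the
            # timer so the backend is not retried on every request.
            value = entry[0]
        self._store[key] = (value, now)
        return value
```

The trade-off is that users may briefly see an out-of-date image, which is usually preferable to an unresponsive site.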

3761 Views ISU Network Issues [resolved]

Link: https://www.it.iastate.edu/cel/view/1934

There are unknown issues with ISU's network this morning causing trouble with various IEM services. I will update this news item as I find things out. The most noticeable impact was for users being unable to connect to the Level II radar server.

Update 8 PM: It took a while to get everything back on the network after ISU had DHCP server issues overnight. We should be back at full strength now. Thanks for your patience.

3792 Views 13 Nov Internet Outage [resolved]

Link: https://www-it.sws.iastate.edu/cel/view/1891

ISU lost Internet access for a number of minutes a bit after 2 PM today. The local IT folks say a soon to be replaced router failed.

3135 Views 10 Oct Internet Outage [resolved]

Link: https://www-it.sws.iastate.edu/cel/view/1875

Internet for the entire ISU campus was out between 2:47 and 3:22 PM today (10 October). No word on what the issue was or if we are stable now.

4:15 PM update: A network misconfiguration was made resulting in the campus wide outage. We should be stable now though.

2949 Views Sep 22 Internet Outage [resolved]

Link: http://mesonet.agron.iastate.edu/

Our local network/Internet was down between 10:05 and 10:12 AM this morning. Unsure what is going on other than a local building router is sick.

2651 Views 9/18 RADAR Data Outage [resolved]

Link: http://mesonet.agron.iastate.edu/

Our upstream source of NWS RADAR data is currently off the Internets, so we are having an outage of RADAR products. Have not heard any details on what the issue is other than network outage!

Update 2:30 PM: The upstream source returned at 2:26 PM and I have mostly repaired the missed data during the outage.

1402 Views 17 Aug - Internet Outages

Link: https://www-it.sws.iastate.edu/cel/view/1850

There were two internet outages this morning as ISU continues work on the network backbone upgrades. There is a light at the end of the tunnel and the hope is that the work is mostly done now.

1336 Views 13 Aug - Internet Outage

Link: https://www-it.sws.iastate.edu/cel/

There was a brief Internet outage between 12:20 and 12:30 PM today as some local network issue occurred. I assume this is related to continued work on local upgrades.

1440 Views 11 Aug - Internet Outage

Link: http://mesonet.agron.iastate.edu/

We lost Internet connectivity around 10:10 AM. It is not clear what is currently broken, but I have hacked around it by hard coding a network routing path. Unsure how stable this config is, but things are working again at the moment (10:45 AM).

1765 Views 7 AM - 9 June Network Outage

Link: https://www-it.sws.iastate.edu/cel/view/1777

A routine ISU network maintenance event this morning for the Agronomy building did not go as hoped and resulted in a prolonged outage between about 6:50 and 7:30 AM this morning.