Downtime

maintenance of grunch

mhu027 • April 8, 2022

Dear Grunch users,

We plan maintenance of grunch.hpc.uib.no this Monday, with about an hour of downtime. Although grunch and cyclone share some filesystems, only grunch will be restarted; cyclone will not be affected.

Edit 2022-04-11 15:20: We have cancelled the maintenance; it will not take place this week or next. We will post a new notice once it is rescheduled. Sorry for the false alarm.

Best regards,

The IT department and the Scientific Computing Group

cyclone not accessible [FIXED]

mhu027 • September 15, 2021

Dear users of cyclone,

cyclone.hpc.uib.no is inaccessible at the moment. We will probably need to restart the system.

More information will follow during the course of today.

Edit 2021-09-15 13:00: The system was unresponsive, so we had to restart cyclone.hpc. We do not know the cause yet and are still looking into it. You can use the system again. Please let us know if anything still seems wrong with cyclone.

Best regards,

The IT department and the Scientific Computing Group

access to cyclone temporarily disabled

mhu027 • June 17, 2021

Dear users of cyclone,

We will have an emergency downtime until further notice because of issues with the Lustre filesystem on cyclone.hpc.uib.no. Please save your work and log out.

Feel free to contact us via hjelp.uib.no if you have any questions.

Apologies for the inconvenience.

Update 18.06: cyclone is up and running. Yesterday we suspected a problem with the local Lustre filesystem, which includes /home and /Data/gfi/stormrisk. To reduce the risk of data loss, we took the systems offline. However, everything appears to be working reliably after all.
Separately, the backup of /home does not always finish. We are still looking into this problem, but you may log in and use cyclone again.

Best regards,

The IT department and the Scientific Computing Group

Downtime: power outage in machine room 16 January

mhu027 • January 19, 2021

Dear grunch and cyclone users,

Last Saturday there was an unexpected power cut in our machine room at Thormøhlensgate 55. As a consequence, cyclone and grunch shut down at about 15:40 and came up again around 17:50.

We are sorry for the inconvenience. We do our best to minimise downtime.

Best regards,

The Scientific Computing Group

Downtime: Machine room power outage at Thormøhlensgate 55, April 17-19

saerda • April 2, 2020

Dear Cyclone and Grunch Users,

Our machine room at Thormøhlensgate 55 will be without power from Friday 17 April 15:00 to Sunday 19 April 24:00.

Maintenance work is planned on the power lines in the building.
Because of this maintenance, we have to take down all our servers at 15:00 on Friday 17 April. We hope to bring all servers back online before 08:00 on Monday 20 April.

The cyclone and grunch servers will be taken down and will not be accessible during this time.

The /shared/ filesystem, which is exported via leo.hpc.uib.no to the university campus, will not be accessible during this time either.

We advise all users to plan their work well in advance to avoid unnecessary problems.

Update 19.04.2020 20:30: The downtime is over; cyclone and grunch are back online.

Please contact us via hjelp.uib.no if you have any further questions.

Best regards,

The Scientific Computing team

Cyclone maintenance 25th September

saerda • September 24, 2019

Cyclone will be taken down tomorrow, 25th September, from 12:00 until 14:00 for regular maintenance. We will apply OS-related updates and also update some core libraries. More information will be posted.
Update 25.09.2019 14:55: cyclone.hpc.uib.no is back online.

Hexagon: urgent reboot is needed

lsz075 • December 3, 2018

Update 2018-12-03 12:36:

Hexagon is up now.
The interconnect errors have been cleared and the /work file system is up and functional again.
Unfortunately the previously submitted jobs had to be canceled. Please resubmit your jobs.

Dear Hexagon User,

We must reboot Hexagon due to repeated errors on the interconnect.
We will update this notice when Hexagon is up and functional again.

Hexagon stopped due to power loss

saerda • February 5, 2018

Hexagon stopped yesterday due to a loss of electrical power. Hexagon has been online again since this morning.

Hexagon reboot after power blink

Alexander Oltu • December 8, 2017

After a series of power blinks, Hexagon's high-performance network and some nodes are in an inconsistent state. We have to restart the whole machine.

All local HPC will be down for ~2 weeks starting tomorrow

Alexander Oltu • November 20, 2017

This is a reminder that tomorrow morning (2017.11.21) Hexagon and Fimm will be shut down for reconfiguration.

ALL DATA on /work, /work/shared (/work-common), /home and /fimm filesystems will be deleted.

Please find more details at https://docs.hpc.uib.no

HPC Syslog

Log of changes and events on UiB's HPC systems
