##### ark:/13030/hb8q2nb82r #####
Failure Analysis of Internet Services
We present operational characteristics and failure data for several large-scale Internet services. Case studies and data are used to broaden the range of metrics used in this analysis. We found that operator-induced errors have the greatest impact and are also the hardest failures to mask. Failure-mitigation techniques such as configuration checking, online testing, and fault/load injection improve Internet service availability.
technical reports, hb1k40068r

[t320] pattern fault recovery failure tolerance reliability failures tolerant fail redundancy fault_tolerant detection reliable recover availability redundant detect critical occur recovering

[t478] web internet server access client file user online information request files services proxy page service browser usage caching resources world

[t289] health service services care provider health_care public_health community need mental promotion program public delivery medical rehabilitation national access disabilities professional

[t473] specification formal verification checking diagram semantic properties tool refinement petri_net symbolic timed notation uml abstract techniques implementation correctness formalism concurrent