API: Automatically repair broken packages

This partially reverts #7150 and avoids exceptions
inside the flow. Each time an empty active stage
is detected, Icinga tries to repair it from the
given directory tree.
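
The repair step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: it uses `std::filesystem` instead of Boost, the helper names `FindStageCandidate` and `RepairActiveStage` are hypothetical, and it only rewrites the `active-stage` marker, whereas the commit calls `ConfigPackageUtility::ActivateStage()`.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns the name of the first subdirectory found
// directly below packageDir, or std::nullopt if there is none.
std::optional<std::string> FindStageCandidate(const fs::path& packageDir)
{
	std::error_code ec;
	fs::directory_iterator it(packageDir, ec);

	if (ec)
		return std::nullopt; // Missing or unreadable package directory.

	for (const fs::directory_entry& entry : it) {
		if (entry.is_directory(ec))
			return entry.path().filename().string(); // Use the first directory found.
	}

	return std::nullopt;
}

// Hypothetical helper: if 'active-stage' is missing or empty, write the first
// stage directory name into it. Returns the stage name now in use, or an
// empty string if nothing could be repaired.
std::string RepairActiveStage(const fs::path& packageDir)
{
	const fs::path marker = packageDir / "active-stage";
	std::string stage;

	{
		std::ifstream in(marker);
		std::getline(in, stage); // Leaves 'stage' empty if the file cannot be read.
	}

	if (!stage.empty())
		return stage; // Nothing to repair.

	std::optional<std::string> candidate = FindStageCandidate(packageDir);

	if (!candidate)
		return ""; // No stage directory available, the caller has to deal with it.

	std::ofstream out(marker, std::ios::trunc);
	out << *candidate << "\n";

	return *candidate;
}
```

Only direct children are inspected here; the first directory that a recursive walk encounters is necessarily a direct child as well. `filename()` is used so that a directory name containing a dot is returned unmodified.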

The code now also creates the package storage on startup:
within the API object or, if the API feature is disabled,
inside the application.

In-memory caching of the active stages for packages
is only in effect when the API feature is enabled.
This is useful for other deployed config packages,
not only the internal one.
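
The lookup order described above can be illustrated with a small sketch. The assumptions are loud ones: `StageCache`, `ReadStageFromFile` and `LookupActiveStage` are hypothetical names, a plain `std::map` stands in for the cache held by the API feature, and a null cache pointer models a setup without the API feature. Unlike the commit, the sketch strips only the trailing newline.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for the in-memory cache owned by the API feature.
using StageCache = std::map<std::string, std::string>;

// Reads the first line of <packagesDir>/<package>/active-stage.
// Returns an empty string instead of throwing when nothing can be read.
std::string ReadStageFromFile(const fs::path& packagesDir, const std::string& package)
{
	std::ifstream in(packagesDir / package / "active-stage");
	std::string stage;
	std::getline(in, stage);
	return stage;
}

// cache == nullptr models "API feature disabled": no caching, file access only.
std::string LookupActiveStage(const fs::path& packagesDir, const std::string& package, StageCache* cache)
{
	if (!cache)
		return ReadStageFromFile(packagesDir, package);

	// First use runtime state.
	auto it = cache->find(package);

	if (it != cache->end())
		return it->second;

	// Fall back to the file and correct memory if something was read.
	std::string stage = ReadStageFromFile(packagesDir, package);

	if (!stage.empty())
		(*cache)[package] = stage;

	return stage; // May be empty, the caller decides how to handle that.
}
```

Returning an empty string rather than throwing mirrors the intent of this change: the caller sees the empty stage and can trigger a repair.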

fixes #7173
refs #7150
refs #7119
fixes #6959
Michael Friedrich 2019-05-10 12:48:34 +02:00
parent 98039e88b4
commit 6cce9c0fdd
7 changed files with 106 additions and 38 deletions

@ -780,7 +780,7 @@ Wrong:
Correct:
```
/var/lib/icinga2/api/packages/_api/abcd-ef12-3456-7890/conf.d/downtimes/1234-5678-9012-3456.conf
/var/lib/icinga2/api/packages/_api/dbe0bef8-c72c-4cc9-9779-da7c4527c5b2/conf.d/downtimes/1234-5678-9012-3456.conf
```
At creation time, the object lives in memory but its storage is broken. Upon restart,
@ -792,16 +792,17 @@ read by the Icinga daemon. This information is stored in `/var/lib/icinga2/api/p
2.11 now limits the direct active-stage file access (this is hidden from the user),
and caches active stages for packages in-memory.
Bonus on startup/config validation: Icinga now logs a critical message when a deployed
config package is broken.
It also tries to repair the broken package, and logs a new message:
```
icinga2 daemon -C
systemctl restart icinga2
[2019-04-26 12:58:14 +0200] critical/ApiListener: Cannot detect active stage for package '_api'. Broken config package, check the troubleshooting documentation.
tail -f /var/log/icinga2/icinga2.log
[2019-05-10 12:27:15 +0200] information/ConfigObjectUtility: Repairing config package '_api' with stage 'dbe0bef8-c72c-4cc9-9779-da7c4527c5b2'.
```
In order to fix the broken config package, and mark a deployed stage as active
If this does not happen, you can manually fix the broken config package, and mark a deployed stage as active
again, carefully do the following steps after creating a backup:
Navigate into the API package prefix.
@ -820,7 +821,7 @@ ls -lahtr
drwx------ 4 michi wheel 128B Mar 27 14:39 ..
-rw-r--r-- 1 michi wheel 25B Mar 27 14:39 include.conf
-rw-r--r-- 1 michi wheel 405B Mar 27 14:39 active.conf
drwx------ 7 michi wheel 224B Mar 27 15:01 abcd-ef12-3456-7890
drwx------ 7 michi wheel 224B Mar 27 15:01 dbe0bef8-c72c-4cc9-9779-da7c4527c5b2
drwx------ 5 michi wheel 160B Apr 26 12:47 .
```
@ -832,16 +833,22 @@ directory. Copy the directory name `abcd-ef12-3456-7890` and
add it into a new file `active-stage`. This can be done like this:
```
echo "abcd-ef12-3456-7890" > active-stage
echo "dbe0bef8-c72c-4cc9-9779-da7c4527c5b2" > active-stage
```
Re-run config validation.
`active.conf` needs to have the correct active stage too, add it again
like this. Note: This is deep down in the code, use with care!
```
icinga2 daemon -C
sed -i 's/ActiveStages\["_api"\].*/ActiveStages\["_api"\] = "dbe0bef8-c72c-4cc9-9779-da7c4527c5b2"/g' /var/lib/icinga2/api/packages/_api/active.conf
```
Restart Icinga 2.
```
systemctl restart icinga2
```
The validation should not show an error.
> **Note**
>

@ -123,12 +123,13 @@ directory path, because the active-stage file was empty/truncated/unreadable at
this point.
2.11 makes this mechanism more stable and detects broken config packages.
It also attempts to fix them; the following log entry is expected and harmless.
```
[2019-04-26 12:58:14 +0200] critical/ApiListener: Cannot detect active stage for package '_api'. Broken config package, check the troubleshooting documentation.
[2019-05-10 12:12:09 +0200] information/ConfigObjectUtility: Repairing config package '_api' with stage 'dbe0bef8-c72c-4cc9-9779-da7c4527c5b2'.
```
In order to fix this, please follow [this troubleshooting entry](15-troubleshooting.md#troubleshooting-api-missing-runtime-objects).
If you still encounter problems, please follow [this troubleshooting entry](15-troubleshooting.md#troubleshooting-api-missing-runtime-objects).
## Upgrading to v2.10 <a id="upgrading-to-2-10"></a>

@ -288,6 +288,9 @@ int DaemonCommand::Run(const po::variables_map& vm, const std::vector<std::strin
Logger::DisableConsoleLog();
}
/* Create the internal API object storage. Do this here too with setups without API. */
ConfigObjectUtility::CreateStorage();
/* Remove ignored Downtime/Comment objects. */
try {
String configDir = ConfigObjectUtility::GetConfigDir();

@ -7,6 +7,7 @@
#include "remote/jsonrpc.hpp"
#include "remote/apifunction.hpp"
#include "remote/configpackageutility.hpp"
#include "remote/configobjectutility.hpp"
#include "base/convert.hpp"
#include "base/defer.hpp"
#include "base/io-engine.hpp"
@ -135,6 +136,9 @@ void ApiListener::OnConfigLoaded()
Log(LogWarning, "ApiListener", "Please read the upgrading documentation for v2.8: https://icinga.com/docs/icinga2/latest/doc/16-upgrading-icinga-2/");
}
/* Create the internal API object storage. */
ConfigObjectUtility::CreateStorage();
/* Cache API packages and their active stage name. */
UpdateActivePackageStagesCache();

@ -10,15 +10,21 @@
#include "base/dependencygraph.hpp"
#include "base/utility.hpp"
#include <boost/algorithm/string/case_conv.hpp>
#include <boost/filesystem.hpp>
#include <boost/system/error_code.hpp>
#include <fstream>
using namespace icinga;
String ConfigObjectUtility::GetConfigDir()
{
/* This may throw an exception the caller above must handle. */
return ConfigPackageUtility::GetPackageDir() + "/_api/" +
ConfigPackageUtility::GetActiveStage("_api");
String prefix = ConfigPackageUtility::GetPackageDir() + "/_api/";
String activeStage = ConfigPackageUtility::GetActiveStage("_api");
if (activeStage.IsEmpty())
RepairPackage("_api");
return prefix + activeStage;
}
String ConfigObjectUtility::GetObjectConfigPath(const Type::Ptr& type, const String& fullName)
@ -33,6 +39,59 @@ String ConfigObjectUtility::GetObjectConfigPath(const Type::Ptr& type, const Str
"/" + EscapeName(fullName) + ".conf";
}
void ConfigObjectUtility::RepairPackage(const String& package)
{
/* Try to fix the active stage, whenever we find a directory in there.
* This automatically heals packages < 2.11 which remained broken.
*/
namespace fs = boost::filesystem;
fs::path path(ConfigPackageUtility::GetPackageDir() + "/" + package + "/");
fs::recursive_directory_iterator end;
String foundActiveStage;
for (fs::recursive_directory_iterator it(path); it != end; it++) {
boost::system::error_code ec;
const fs::path d = *it;
if (fs::is_directory(d, ec)) {
/* Extract the relative directory name. */
foundActiveStage = d.stem().string();
break; // Use the first found directory.
}
}
if (!foundActiveStage.IsEmpty()) {
Log(LogInformation, "ConfigObjectUtility")
<< "Repairing config package '" << package << "' with stage '" << foundActiveStage << "'.";
ConfigPackageUtility::ActivateStage(package, foundActiveStage);
} else {
BOOST_THROW_EXCEPTION(std::invalid_argument("Cannot repair package '" + package + "', please check the troubleshooting docs."));
}
}
void ConfigObjectUtility::CreateStorage()
{
boost::mutex::scoped_lock lock(ConfigPackageUtility::GetStaticPackageMutex());
/* For now, we only use _api as our creation target. */
String package = "_api";
if (!ConfigPackageUtility::PackageExists(package)) {
Log(LogNotice, "ConfigObjectUtility")
<< "Package " << package << " doesn't exist yet, creating it.";
ConfigPackageUtility::CreatePackage(package);
String stage = ConfigPackageUtility::CreateStage(package);
ConfigPackageUtility::ActivateStage(package, stage);
}
}
String ConfigObjectUtility::EscapeName(const String& name)
{
return Utility::EscapeString(name, "<>:\"/\\|?*", true);
@ -88,16 +147,7 @@ String ConfigObjectUtility::CreateObjectConfig(const Type::Ptr& type, const Stri
bool ConfigObjectUtility::CreateObject(const Type::Ptr& type, const String& fullName,
const String& config, const Array::Ptr& errors, const Array::Ptr& diagnosticInformation)
{
{
boost::mutex::scoped_lock lock(ConfigPackageUtility::GetStaticPackageMutex());
if (!ConfigPackageUtility::PackageExists("_api")) {
ConfigPackageUtility::CreatePackage("_api");
String stage = ConfigPackageUtility::CreateStage("_api");
ConfigPackageUtility::ActivateStage("_api", stage);
}
}
CreateStorage();
ConfigItem::Ptr item = ConfigItem::GetByTypeAndName(type, fullName);

@ -23,6 +23,8 @@ class ConfigObjectUtility
public:
static String GetConfigDir();
static String GetObjectConfigPath(const Type::Ptr& type, const String& fullName);
static void RepairPackage(const String& package);
static void CreateStorage();
static String CreateObjectConfig(const Type::Ptr& type, const String& fullName,
bool ignoreOnError, const Array::Ptr& templates, const Dictionary::Ptr& attrs);

@ -265,7 +265,7 @@ String ConfigPackageUtility::GetActiveStageFromFile(const String& packageName)
fp.close();
if (fp.fail())
BOOST_THROW_EXCEPTION(std::invalid_argument("Cannot detect active stage for package '" + packageName + "'. Broken config package, check the troubleshooting documentation."));
return ""; /* Don't use exceptions here. The caller must deal with empty stages at this point. Happens on initial package creation for example. */
return stage.Trim();
}
@ -283,13 +283,16 @@ void ConfigPackageUtility::SetActiveStageToFile(const String& packageName, const
String ConfigPackageUtility::GetActiveStage(const String& packageName)
{
String activeStage;
ApiListener::Ptr listener = ApiListener::GetInstance();
/* config packages without API make no sense. */
/* If we don't have an API feature, just use the file storage without caching this.
* This happens when ScheduledDowntime objects generate Downtime objects.
* TODO: Make the API a first class citizen.
*/
if (!listener)
BOOST_THROW_EXCEPTION(std::invalid_argument("No ApiListener instance configured."));
String activeStage;
return GetActiveStageFromFile(packageName);
/* First use runtime state. */
try {
@ -301,8 +304,6 @@ String ConfigPackageUtility::GetActiveStage(const String& packageName)
/* When we've read something, correct memory. */
if (!activeStage.IsEmpty())
listener->SetActivePackageStage(packageName, activeStage);
else
BOOST_THROW_EXCEPTION(std::invalid_argument("Cannot detect active stage for package '" + packageName + "'. Broken config package, check the troubleshooting documentation."));
}
return activeStage;
@ -310,16 +311,16 @@ String ConfigPackageUtility::GetActiveStage(const String& packageName)
void ConfigPackageUtility::SetActiveStage(const String& packageName, const String& stageName)
{
/* Update the marker on disk for restarts. */
SetActiveStageToFile(packageName, stageName);
ApiListener::Ptr listener = ApiListener::GetInstance();
/* config packages without API make no sense. */
/* No API, no caching. */
if (!listener)
BOOST_THROW_EXCEPTION(std::invalid_argument("No ApiListener instance configured."));
return;
listener->SetActivePackageStage(packageName, stageName);
/* Also update the marker on disk for restarts. */
SetActiveStageToFile(packageName, stageName);
}
std::vector<std::pair<String, bool> > ConfigPackageUtility::GetFiles(const String& packageName, const String& stageName)