Return code is not 0: how to fix it

Ansible will fail if the exit status of a task is any non-zero value.

Today, let us see how we can fix this error.

Non-Zero return code: Ansible

Generally, the error looks like this:

TASK [Non-Zero return] **********************************************************************************
fatal: [server1.lab.com]: FAILED! => {"changed": true, "cmd": "ls | grep wp-config.php", "delta": "0:00:00.021103", "end": "2021-06-29 12:53:49.222176", "msg": "non-zero return code", "rc": 1, "start": "2021-06-29 12:53:49.201073", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
fatal: [server2.lab.com]: FAILED! => {"changed": true, "cmd": "ls | grep wp-config.php", "delta": "0:00:00.021412", "end": "2021-06-29 12:53:50.697567", "msg": "non-zero return code", "rc": 1, "start": "2021-06-29 12:53:50.676155", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
fatal: [server3.lab.com]: FAILED! => {"changed": true, "cmd": "ls | grep wp-config.php", "delta": "0:00:00.015554", "end": "2021-06-29 12:53:50.075555", "msg": "non-zero return code", "rc": 1, "start": "2021-06-29 12:53:50.060001", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

In general, a command that exits with a zero exit status has run successfully.

On the other hand, any non-zero exit status of the command indicates an error.

For example,

$ date
Tuesday 29 June 2021 05:21:28 PM IST
$ echo $?
0

Here, we can see the successful execution of the shell command “date”. Hence, the exit status of the command is 0.

A non-zero exit status indicates failure. For example,

$ date yesterday
date: invalid date ‘yesterday’
$ echo $?
1

Here, the argument for the ‘date’ command, “yesterday”, is invalid. Hence, the exit status is 1, indicating the command ended in error.

However, some commands return a non-zero value even though they execute properly.

$ ls | grep wp-config.php
$ echo $?
1

Here, the wp-config.php file doesn’t exist in that directory. Even though the command executes without error, the exit status is 1, because grep returns 1 when no line matches.

By default, Ansible will report it as failed.

How to resolve the problem?

The best practice is to avoid the shell module in the playbook wherever possible.

In most cases there is an Ansible module that performs the same operation.

For this example, we can use the built-in find module, which locates files directly through Ansible.
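As a minimal sketch of the find approach, the playbook below replaces ls | grep with ansible.builtin.find. The search path /var/www/html is an assumed example location and is not taken from the original setup; the variable name wp simply mirrors the other examples in this article. The module reports the number of matching files in its matched return value, so a search that finds nothing returns matched: 0 rather than a non-zero return code.

```yaml
---
- hosts: all
  tasks:
    # The path below is an assumption; point it at the real document root.
    - name: Look for wp-config.php without using the shell
      ansible.builtin.find:
        paths: /var/www/html
        patterns: wp-config.php
      register: wp

    # wp.matched is the count of files that matched the pattern.
    - name: Report whether the file was found
      ansible.builtin.debug:
        msg: "wp-config.php found: {{ wp.matched > 0 }}"
```

Later tasks can then be made conditional with something like when: wp.matched > 0 instead of inspecting a shell return code.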

Alternatively, we can define the condition for a failure at the task level with the help of failed_when.

For example,

---
- hosts: all
  tasks:
    - name: Non-Zero return
      shell: "ls | grep wp-config.php"
      register: wp
      failed_when: "wp.rc not in [ 0, 1 ]"

TASK [Non-Zero return] ***********************************************************************************************************
changed: [server1.lab.com]
changed: [server2.lab.com]
changed: [server3.lab.com]

Although the exit status is not zero, the task is no longer reported as failed and the play continues on the server.

Here, the task result is registered in a variable, and failed_when evaluates its rc value. The task is reported as a failure only when the return code is something other than 0 or 1.

On the other hand, we can ignore the errors altogether.

For that, we use ignore_errors in the task to ignore any failure during the task.

---
- hosts: all
  tasks:
    - name: Non-Zero return
      shell: "ls | grep wp-config.php"
      ignore_errors: true

TASK [Non-Zero return] ***********************************************************************************************************
fatal: [server1.lab.com]: FAILED! => {"changed": true, "cmd": "ls | grep wp-config.php", "delta": "0:00:00.004055", "end": "2021-06-29 13:09:20.631570", "msg": "non-zero return code", "rc": 1, "start": "2021-06-29 13:09:20.627515", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [server2.lab.com]: FAILED! => {"changed": true, "cmd": "ls | grep wp-config.php", "delta": "0:00:00.006745", "end": "2021-06-29 13:09:22.110059", "msg": "non-zero return code", "rc": 1, "start": "2021-06-29 13:09:22.103314", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [server3.lab.com]: FAILED! => {"changed": true, "cmd": "ls | grep wp-config.php", "delta": "0:00:00.004957", "end": "2021-06-29 13:09:21.465326", "msg": "non-zero return code", "rc": 1, "start": "2021-06-29 13:09:21.460369", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring

By default, Ansible treats a task as successful only when its exit status is zero. Otherwise, the task fails.

We can control this by registering the task result in a variable and then using a failed_when condition to determine whether the task fails or succeeds.

To continue the playbook, in spite of the failure, we can use the ignore_errors option on the task.

Conclusion

In short, we saw how to handle the "non-zero return code" error in Ansible with the find module, failed_when, and ignore_errors.


Ansible – Resolve “non-zero return code”


The "non-zero return code" message is displayed when a task that uses the shell module exits with a return code other than 0. For example, the following shell command will almost always have a return code of 1, since grep returns 1 when nothing matches.

---
- hosts: all
  tasks:
    - name: ps command
      shell: ps | grep foo

Running this playbook will return the following.

PLAY [all]

TASK [Gathering Facts]
ok: [server1.example.com]

TASK [ps command]
fatal: [server1.example.com]: FAILED! => {"changed": true, "cmd": "ps | grep foo", "delta": "0:00:00.021343", "end": "2020-03-13 21:52:36.185781", "msg": "non-zero return code", "rc": 1, "start": "2020-03-13 21:52:36.164438", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

PLAY RECAP
server1.example.com : ok=1  changed=0  unreachable=0  failed=1

Since return codes 0 and 1 are both acceptable for this pipeline (grep returns 1 when it finds no match), the failed_when parameter can be used to fail only when the rc (return code) is neither 0 nor 1.

---
- hosts: all
  tasks:
    - name: ps command
      shell: ps | grep foo
      register: ps
      failed_when: ps.rc not in [ 0, 1 ]
...

Or the ignore_errors parameter can be used.

---
- hosts: all
  tasks:
    - name: ps command
      shell: ps | grep foo
      ignore_errors: true
...

Or a meta: clear_host_errors task can be used.

---
- hosts: all
  tasks:
    - name: ps command
      shell: ps | grep foo

    - meta: clear_host_errors
...


I have a framework written in python, and for testing purposes I basically want to do a subprocess (aka shell call) … that should simply come back with a RC != 0. I tried to invoke some non-existing executable; or to run “exit 1”; but those are for some reason translated to a FileNotFoundError.

So, what else could I do to trigger a return code != 0 (in a “reliable” way; meaning the command should not suddenly return 0 at a future point in time).

I thought to “search” for a binary called exit, but well:

> /usr/bin/env exit
/usr/bin/env: exit: No such file or directory

asked Apr 10, 2015 at 13:07 by GhostCat

If you’re looking for a system command that always returns a non-zero exit code, then /bin/false seems like it should work for you. From man false:

NAME
       false - do nothing, unsuccessfully

SYNOPSIS
       false [ignored command line arguments]
       false OPTION

DESCRIPTION
       Exit with a status code indicating failure.

answered Apr 10, 2015 at 13:58 by steeldriver

You can produce any return code with the command bash -c "exit RETURNCODE", replacing “RETURNCODE” with any number. Note that the value is truncated to an 8-bit unsigned integer (0 to 255), that is, RETURNCODE mod 256.

You can check the return code of the last shell command in the terminal by running echo $?. The “$?” variable contains the most recent return code, and “echo” prints it to standard output.

answered Apr 10, 2015 at 13:42 by Byte Commander

After some more testing, I found that my problem was not on the “Linux” side.

Python has a module shlex; which should be used to “split” command strings. When I changed my subprocess call to use the output of shlex.split() invoking “bash exit 1” gives me what I need.

answered Apr 10, 2015 at 13:25 by GhostCat

I have an ansible task:

  - name: Get vault's binary path
    shell: type -p vault
    register: vault_binary_path

returns

 TASK [update_vault : Get vault's binary path] **********************************************************************************************************************************************************************
fatal: [xxxxx]: FAILED! => {"changed": true, "cmd": "type -p vault", "delta": "0:00:00.003303", "end": "2020-04-08 11:37:19.636528", "msg": "non-zero return code", "rc": 1, "start": "2020-04-08 11:37:19.633225", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

but when I run it in shell it returns just fine

[root@ip-xxxxx]# type -p vault
/usr/local/bin/vault

I run ansible as root with become: true. All previous steps are fine up until this one. Any advice appreciated.

When Ansible receives a non-zero return code from a command or a failure from a module, by default it stops executing on that host and continues on other hosts. However, in some circumstances you may want different behavior. Sometimes a non-zero return code indicates success. Sometimes you want a failure on one host to stop execution on all hosts. Ansible provides tools and settings to handle these situations and help you get the behavior, output, and reporting you want.

  • Ignoring failed commands
  • Ignoring unreachable host errors
  • Resetting unreachable hosts
  • Handlers and failure
  • Defining failure
  • Defining “changed”
  • Ensuring success for command and shell
  • Aborting a play on all hosts
    • Aborting on the first error: any_errors_fatal
    • Setting a maximum failure percentage
  • Controlling errors in blocks

Ignoring failed commands

By default Ansible stops executing tasks on a host when a task fails on that host. You can use ignore_errors to continue on in spite of the failure:

- name: Do not count this as a failure
  ansible.builtin.command: /bin/false
  ignore_errors: yes

The ignore_errors directive only works when the task is able to run and returns a value of ‘failed’. It does not make Ansible ignore undefined variable errors, connection failures, execution issues (for example, missing packages), or syntax errors.
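Ignoring a failure is often paired with registering the result so that a later task can react to it. The following is a minimal sketch of that pattern; the variable name result and the debug message are illustrative assumptions, and /bin/false is used only to force a failure.

```yaml
# /bin/false always exits non-zero, so this task fails and the failure is ignored.
- name: Run a command that may fail
  ansible.builtin.command: /bin/false
  register: result
  ignore_errors: true

# The "failed" test is true when the registered task reported a failure.
- name: React only if the previous task failed
  ansible.builtin.debug:
    msg: "The command failed with return code {{ result.rc }}"
  when: result is failed
```

This keeps the play running while still leaving a visible trace of which hosts hit the failure.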

Ignoring unreachable host errors

New in version 2.7.

You can ignore a task failure due to the host instance being ‘UNREACHABLE’ with the ignore_unreachable keyword. Ansible ignores the task errors, but continues to execute future tasks against the unreachable host. For example, at the task level:

- name: This executes, fails, and the failure is ignored
  ansible.builtin.command: /bin/true
  ignore_unreachable: yes

- name: This executes, fails, and ends the play for this host
  ansible.builtin.command: /bin/true

And at the playbook level:

- hosts: all
  ignore_unreachable: yes
  tasks:
  - name: This executes, fails, and the failure is ignored
    ansible.builtin.command: /bin/true

  - name: This executes, fails, and ends the play for this host
    ansible.builtin.command: /bin/true
    ignore_unreachable: no

Resetting unreachable hosts

If Ansible cannot connect to a host, it marks that host as ‘UNREACHABLE’ and removes it from the list of active hosts for the run. You can use meta: clear_host_errors to reactivate all hosts, so subsequent tasks can try to reach them again.
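As a rough sketch, a play that retries unreachable hosts could look like the following. The use of ansible.builtin.ping as the task on either side is an assumption chosen for illustration; whether the second attempt succeeds depends on why the host was unreachable in the first place.

```yaml
- hosts: all
  gather_facts: false
  tasks:
    - name: First attempt to reach every host
      ansible.builtin.ping:

    # Hosts marked UNREACHABLE above are returned to the active host list here.
    - name: Reactivate all hosts
      ansible.builtin.meta: clear_host_errors

    - name: Second attempt, which again targets all hosts in the play
      ansible.builtin.ping:
```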

Handlers and failure

Ansible runs handlers at the end of each play. If a task notifies a handler but another task fails later in the play, by default the handler does not run on that host, which may leave the host in an unexpected state. For example, a task could update a configuration file and notify a handler to restart some service. If a task later in the same play fails, the configuration file might be changed but the service will not be restarted.

You can change this behavior with the --force-handlers command-line option, by including force_handlers: True in a play, or by adding force_handlers = True to ansible.cfg. When handlers are forced, Ansible will run all notified handlers on all hosts, even hosts with failed tasks. (Note that certain errors could still prevent the handler from running, such as a host becoming unreachable.)
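A minimal sketch of the play-level setting is shown below. The file paths, the service name example, and the handler name are hypothetical placeholders; /bin/false stands in for any later task that fails.

```yaml
- hosts: webservers
  force_handlers: true
  tasks:
    # Hypothetical file and destination, used only to trigger the handler.
    - name: Update a configuration file
      ansible.builtin.copy:
        src: example.conf
        dest: /etc/example/example.conf
      notify: Restart example service

    # This task fails after the handler has already been notified.
    - name: A later task that fails
      ansible.builtin.command: /bin/false

  handlers:
    - name: Restart example service
      ansible.builtin.service:
        name: example
        state: restarted
```

With force_handlers enabled, the notified handler still runs on hosts where the later task failed, so the changed configuration file and the running service do not drift apart.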

Defining failure

Ansible lets you define what “failure” means in each task using the failed_when conditional. As with all conditionals in Ansible, lists of multiple failed_when conditions are joined with an implicit and, meaning the task only fails when all conditions are met. If you want to trigger a failure when any of the conditions is met, you must define the conditions in a string with an explicit or operator.

You may check for failure by searching for a word or phrase in the output of a command:

- name: Fail task when the command error output prints FAILED
  ansible.builtin.command: /usr/bin/example-command -x -y -z
  register: command_result
  failed_when: "'FAILED' in command_result.stderr"

or based on the return code:

- name: Fail task when both files are identical
  ansible.builtin.raw: diff foo/file1 bar/file2
  register: diff_cmd
  failed_when: diff_cmd.rc == 0 or diff_cmd.rc >= 2

You can also combine multiple conditions for failure. This task will fail if both conditions are true:

- name: Check if a file exists in temp and fail task if it does
  ansible.builtin.command: ls /tmp/this_should_not_be_here
  register: result
  failed_when:
    - result.rc == 0
    - '"No such" not in result.stdout'

If you want the task to fail when only one condition is satisfied, change the failed_when definition to:

failed_when: result.rc == 0 or "No such" not in result.stdout

If you have too many conditions to fit neatly into one line, you can split it into a multi-line yaml value with >:

- name: example of many failed_when conditions with OR
  ansible.builtin.shell: "./myBinary"
  register: ret
  failed_when: >
    ("No such file or directory" in ret.stdout) or
    (ret.stderr != '') or
    (ret.rc == 10)

Defining “changed”

Ansible lets you define when a particular task has “changed” a remote node using the changed_when conditional. This lets you determine, based on return codes or output, whether a change should be reported in Ansible statistics and whether a handler should be triggered or not. As with all conditionals in Ansible, lists of multiple changed_when conditions are joined with an implicit and, meaning the task only reports a change when all conditions are met. If you want to report a change when any of the conditions is met, you must define the conditions in a string with an explicit or operator. For example:

tasks:

  - name: Report 'changed' when the return code is not equal to 2
    ansible.builtin.shell: /usr/bin/billybass --mode="take me to the river"
    register: bass_result
    changed_when: "bass_result.rc != 2"

  - name: This will never report 'changed' status
    ansible.builtin.shell: wall 'beep'
    changed_when: False

You can also combine multiple conditions to override “changed” result:

- name: Combine multiple conditions to override 'changed' result
  ansible.builtin.command: /bin/fake_command
  register: result
  ignore_errors: True
  changed_when:
    - '"ERROR" in result.stderr'
    - result.rc == 2

See Defining failure for more conditional syntax examples.

Ensuring success for command and shell

The command and shell modules care about return codes, so if you have a command whose successful exit code is not zero, you can do this:

tasks:
  - name: Run this command and ignore the result
    ansible.builtin.shell: /usr/bin/somecommand || /bin/true

Aborting a play on all hosts

Sometimes you want a failure on a single host, or failures on a certain percentage of hosts, to abort the entire play on all hosts. You can stop play execution after the first failure happens with any_errors_fatal. For finer-grained control, you can use max_fail_percentage to abort the run after a given percentage of hosts has failed.

Aborting on the first error: any_errors_fatal

If you set any_errors_fatal and a task returns an error, Ansible finishes the fatal task on all hosts in the current batch, then stops executing the play on all hosts. Subsequent tasks and plays are not executed. You can recover from fatal errors by adding a rescue section to the block. You can set any_errors_fatal at the play or block level:

- hosts: somehosts
  any_errors_fatal: true
  roles:
    - myrole

- hosts: somehosts
  tasks:
    - block:
        - include_tasks: mytasks.yml
      any_errors_fatal: true

You can use this feature when all tasks must be 100% successful to continue playbook execution. For example, if you run a service on machines in multiple data centers with load balancers to pass traffic from users to the service, you want all load balancers to be disabled before you stop the service for maintenance. To ensure that any failure in the task that disables the load balancers will stop all other tasks:

---
- hosts: load_balancers_dc_a
  any_errors_fatal: true

  tasks:
    - name: Shut down datacenter 'A'
      ansible.builtin.command: /usr/bin/disable-dc

- hosts: frontends_dc_a

  tasks:
    - name: Stop service
      ansible.builtin.command: /usr/bin/stop-software

    - name: Update software
      ansible.builtin.command: /usr/bin/upgrade-software

- hosts: load_balancers_dc_a

  tasks:
    - name: Start datacenter 'A'
      ansible.builtin.command: /usr/bin/enable-dc

In this example Ansible starts the software upgrade on the front ends only if all of the load balancers are successfully disabled.

Setting a maximum failure percentage

By default, Ansible continues to execute tasks as long as there are hosts that have not yet failed. In some situations, such as when executing a rolling update, you may want to abort the play when a certain threshold of failures has been reached. To achieve this, you can set a maximum failure percentage on a play:

---
- hosts: webservers
  max_fail_percentage: 30
  serial: 10

The max_fail_percentage setting applies to each batch when you use it with serial. In the example above, if more than 3 of the 10 servers in the first (or any) batch of servers failed, the rest of the play would be aborted.

Note

The percentage set must be exceeded, not equaled. For example, if serial were set to 4 and you wanted the task to abort the play when 2 of the systems failed, set the max_fail_percentage at 49 rather than 50.
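The case described in the note can be written out as follows. This is only an illustration of the arithmetic: with batches of 4 hosts, one failure is 25% (below the threshold) and two failures are 50%, which exceeds 49 and aborts the play. The task reuses /usr/bin/upgrade-software from the earlier example as a placeholder.

```yaml
---
- hosts: webservers
  # 2 failures out of 4 hosts = 50%, which exceeds 49 and aborts the play.
  max_fail_percentage: 49
  serial: 4
  tasks:
    - name: Update software (placeholder command)
      ansible.builtin.command: /usr/bin/upgrade-software
```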

Controlling errors in blocks

You can also use blocks to define responses to task errors. This approach is similar to exception handling in many programming languages. See Handling errors with blocks for details and examples.
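For orientation, a minimal sketch of a block with rescue and always sections follows. The task names and debug messages are illustrative assumptions, and /bin/false is used only to force a failure inside the block.

```yaml
- hosts: all
  tasks:
    - name: Handle a failing command with block, rescue, and always
      block:
        - name: Command that fails (forced with /bin/false)
          ansible.builtin.command: /bin/false
      rescue:
        # Runs only when a task inside the block has failed.
        - name: Recover from the failure
          ansible.builtin.debug:
            msg: "Recovering from the failure"
      always:
        # Runs whether or not the block failed.
        - name: Clean up
          ansible.builtin.debug:
            msg: "Cleanup step"
```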

